Logical Methods in Computer Science 
Vol. 5 (1:3) 2009, pp. 1-38 
www.lmcs-online.org 



Submitted Apr. 22, 2008 
Published Feb. 19, 2009 



THE SAFE LAMBDA CALCULUS 

WILLIAM BLUM AND C.-H. LUKE ONG 

Oxford University Computing Laboratory - School of Informatics, University of Edinburgh, UK 
e-mail address: william.blum@comlab.ox.ac.uk 

Oxford University Computing Laboratory, Oxford, UK 
e-mail address: luke.ong@comlab.ox.ac.uk 



Abstract. Safety is a syntactic condition of higher-order grammars that constrains oc- 
currences of variables in the production rules according to their type-theoretic order. In 
this paper, we introduce the safe lambda calculus, which is obtained by transposing (and 
generalizing) the safety condition to the setting of the simply-typed lambda calculus. In 
contrast to the original definition of safety, our calculus does not constrain types (to be 
homogeneous). We show that in the safe lambda calculus, there is no need to rename 
bound variables when performing substitution, as variable capture is guaranteed not to 
happen. We also propose an adequate notion of β-reduction that preserves safety. In the
same vein as Schwichtenberg's 1976 characterization of the simply-typed lambda calculus, 
we show that the numeric functions representable in the safe lambda calculus are exactly 
the multivariate polynomials; thus conditional is not definable. We also give a characteri- 
zation of representable word functions. We then study the complexity of deciding beta-eta 
equality of two safe simply-typed terms and show that this problem is PSPACE-hard. Fi- 
nally we give a game-semantic analysis of safety: We show that safe terms are denoted by 
P-incrementally justified strategies. Consequently pointers in the game semantics of safe
λ-terms are only necessary from order 4 onwards.



Introduction 

Background. The safety condition was introduced by Knapik, Niwiński and Urzyczyn
at FoSSaCS 2002 [19] in a seminal study of the algorithmics of infinite trees generated
by higher-order grammars. The idea, however, goes back some twenty years to Damm
[10] who introduced an essentially equivalent syntactic restriction (for generators of word
languages) in the form of derived types. A higher-order grammar (that is assumed to be 
homogeneously typed) is said to be safe if it obeys certain syntactic conditions that constrain 
the occurrences of variables in the production (or rewrite) rules according to their type- 
theoretic order. Though the formal definition of safety is somewhat intricate, the condition 

1998 ACM Subject Classification: F.3.2, F.4.1. 

Key words and phrases: lambda calculus, higher-order recursion scheme, safety restriction, game 
semantics. 

Some of the results presented here were first published in TLCA proceedings [8].
1 See de Miranda's thesis [12] for a proof.
itself is manifestly important. As we survey in the following, higher-order safe grammars
capture fundamental structures in computation and offer clear algorithmic advantages: 

• Word languages. Damm and Goerdt [11] have shown that the word languages generated
by order-n safe grammars form an infinite hierarchy as n varies over the natural numbers. 
The hierarchy gives an attractive classification of the semi-decidable languages: Levels 0, 
1 and 2 of the hierarchy are respectively the regular, context-free, and indexed languages 
(in the sense of Aho [5]), although little is known about higher orders. 

Remarkably, for generating word languages, order-n safe grammars are equivalent to
order-n pushdown automata [11], which are in turn equivalent to order-n indexed
grammars pi ES] .

• Trees. Knapik et al. have shown that the Monadic Second Order (MSO) theories of trees
generated by safe (deterministic) grammars of every finite order are decidable².

They have also generalized the equi-expressivity result due to Damm and Goerdt [11] 
to an equivalence result with respect to generating trees: A ranked tree is generated by an 
order-n safe grammar if and only if it is generated by an order-n pushdown automaton. 

• Graphs. Caucal [9] has shown that the MSO theories of graphs generated³ by safe
grammars of every finite order are decidable. Recently Hague et al. have shown that the MSO
theories of graphs generated by order-n unsafe grammars are undecidable, but deciding
their modal mu-calculus theories is n-EXPTIME complete [17].

Overview. In this paper, we examine the safety condition in the setting of the lambda 
calculus. Our first task is to transpose it to the lambda calculus and express it as an 
appropriate sub-system of the simply-typed theory. A first version of the safe lambda 
calculus has appeared in an unpublished technical report [3]. Here we propose a more 
general and cleaner version where terms are no longer required to be homogeneously typed 
(see Section 1 for a definition). The formation rules of the calculus are designed to maintain
a simple invariant: Variables that occur free in a safe λ-term have orders no smaller than
that of the term itself. We can now explain the sense in which the safe lambda calculus is safe
by establishing its salient property: No variable capture can ever occur when substituting
a safe term into another. In other words, in the safe lambda calculus, it is safe to use
capture-permitting substitution when performing β-reduction.

There is no need for new names when computing β-reductions of safe λ-terms, because
one can safely "reuse" variable names in the input term. Safe lambda calculus is thus
cheaper to compute in this naive sense. Intuitively one would expect the safety constraint 
to lower the expressivity of the simply-typed lambda calculus. Our next contribution is to 
give a precise measure of the expressivity deficit of the safe lambda calculus. An old result 
of Schwichtenberg [M] says that the numeric functions representable in the simply-typed 
lambda calculus are exactly the multivariate polynomials extended with the conditional 
function. In the same vein, we show that the numeric functions representable in the safe 
lambda calculus are exactly the multivariate polynomials. 

2 It has recently been shown [30] that trees generated by unsafe deterministic grammars (of every finite 
order) also have decidable MSO theories. More precisely, the MSO theory of trees generated by order-n 
recursion schemes is n-EXPTIME complete. 

3 These are precisely the configuration graphs of higher-order pushdown systems.
Our last contribution is to give a game-semantic account of the safe lambda calculus.
Using a correspondence result relating the game semantics of a λ-term M to a set of
traversals [30] over a certain abstract syntax tree of the η-long form of M (called computation
tree), we show that safe terms are denoted by P-incrementally justified strategies. In such
a strategy, pointers emanating from the P-moves of a play are uniquely reconstructible
from the underlying sequence of moves and the pointers associated to the O-moves therein:
Specifically, a P-question always points to the last pending O-question (in the P-view) of a
greater order. Consequently pointers in the game semantics of safe λ-terms are only
necessary from order 4 onwards. Finally we prove that a β-normal λ-term is safe if and only if
its strategy denotation is (innocent and) P-incrementally justified.



Higher-order safe grammars. We first present the safety restriction as it was originally
defined [19]. We consider simple types generated by the grammar $A ::= o \mid A \to A$. By
convention, $\to$ associates to the right. Thus every type can be written as
$A_1 \to \cdots \to A_n \to o$, which we shall abbreviate to $(A_1, \cdots, A_n, o)$ (in case $n = 0$, we
identify $(o)$ with $o$). We will also use the notation $A^n \to B$ for all types $A$, $B$ and every
positive natural number $n > 0$, defined by induction as: $A^1 \to B = A \to B$ and
$A^{n+1} \to B = A \to (A^n \to B)$. The order of a type is given by $\mathrm{ord}\, o = 0$ and
$\mathrm{ord}(A \to B) = \max(\mathrm{ord}\, A + 1, \mathrm{ord}\, B)$. We assume an
infinite set of typed variables. The order of a typed term or symbol is defined to be the
order of its type. The set of applicative terms over a set of typed symbols is defined as its
closure under the application operation (i.e., if $M : A \to B$ and $N : A$ are in the closure
then so is $MN : B$).
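
The definitions of simple types and of their order translate directly into code. The following is a minimal sketch, assuming an encoding that is not part of the paper: the ground type is the string "o" and an arrow type $A \to B$ is the pair (A, B). The helper names arrow, power and order are illustrative only.

```python
O = "o"  # the ground type o (encoding assumed for this sketch)


def arrow(*types):
    """Build (A1, ..., An, B) = A1 -> ... -> An -> B, associating to the right."""
    *args, result = types
    for a in reversed(args):
        result = (a, result)
    return result


def power(a, n, b):
    """A^n -> B for n >= 1: A^1 -> B = A -> B and A^(n+1) -> B = A -> (A^n -> B)."""
    if n < 1:
        raise ValueError("n must be a positive natural number")
    return arrow(*([a] * n), b)


def order(t):
    """ord o = 0 and ord(A -> B) = max(ord A + 1, ord B)."""
    if t == O:
        return 0
    dom, cod = t
    return max(order(dom) + 1, order(cod))


print(order(O))                          # 0
print(order(arrow(O, O, O)))             # 1
print(order(arrow(arrow(O, O), O)))      # 2
print(power(O, 2, O) == arrow(O, O, O))  # True
```

For instance $(o, o, o)$ has order 1 while $((o, o), o)$ has order 2, since only nesting to the left of an arrow raises the order.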

A (higher-order) grammar is a tuple $\langle \Sigma, \mathcal{N}, \mathcal{R}, S \rangle$, where $\Sigma$ is a ranked alphabet (in
the sense that each symbol $f \in \Sigma$ is assumed to have type $o^r \to o$ where $r$ is the arity of
$f$) of terminals; $\mathcal{N}$ is a finite set of typed non-terminals; $S$ is a distinguished ground-type
symbol of $\mathcal{N}$, called the start symbol; $\mathcal{R}$ is a finite set of production (or rewrite) rules, one
for each non-terminal $F : (A_1, \ldots, A_n, o) \in \mathcal{N}$, of the form $F z_1 \ldots z_m \to e$ where each $z_i$
(called parameter) is a variable of type $A_i$ and $e$ is an applicative term of type $o$ generated
from the typed symbols in $\Sigma \cup \mathcal{N} \cup \{z_1, \ldots, z_m\}$. We say that the grammar is order-$n$ just
in case the order of the highest-order non-terminal is $n$.

A higher-order recursion scheme is a higher-order grammar that is deterministic
(i.e., for each non-terminal $F \in \mathcal{N}$ there is exactly one production rule with $F$ on the left
hand side). Higher-order recursion schemes are used as generators of infinite trees. The
tree generated by a recursion scheme $G$ is a possibly infinite applicative term, but
viewed as a $\Sigma$-labelled tree; it is constructed from the terminals in $\Sigma$, and is obtained by
unfolding the rewrite rules of $G$ ad infinitum, replacing formal by actual parameters each
time, starting from the start symbol $S$. See e.g. [19] for a formal definition.

Example 1.1. Let G be the following order-2 recursion scheme:
where the arities of the terminals g, h, a are 2, 1, 0 respectively. The tree
generated by G is defined by the infinite term g a (g a (h (h (h ···)))).
A type $(A_1, \cdots, A_n, o)$ is said to be homogeneous if $\mathrm{ord}\, A_1 \geq \mathrm{ord}\, A_2 \geq \cdots \geq \mathrm{ord}\, A_n$,
and each $A_1, \ldots, A_n$ is homogeneous [19]. We reproduce the following definition of
Knapik et al. [19].

Definition 1.2 (Safe grammar). (All types are assumed to be homogeneous.) A term of
order $k > 0$ is unsafe if it contains an occurrence of a parameter of order strictly less than $k$,
otherwise the term is safe. An occurrence of an unsafe term $t$ as a subexpression of a term
$t'$ is safe if it is in the context $\cdots (t s) \cdots$, otherwise the occurrence is unsafe. A grammar
is safe if no unsafe term has an unsafe occurrence at a right-hand side of any production.

Example 1.3. (i) Take $H : ((o, o), o)$ and $f : (o, o, o)$; the following rewrite rules are unsafe
(In each case we underline the unsafe subterm that occurs unsafely):

$F^{((o,o),o,o,o)}\, z\, x\, y \to f\, (F\, \underline{(F\, z\, y)}\, y\, (z\, x))\, x$

(ii) The order-2 grammar defined in Example 1.1 is unsafe.
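
Definition 1.2 can be checked mechanically on the right-hand side of a rule. The sketch below is illustrative only and rests on assumptions that are not in the paper: types are encoded as "o" or pairs (A, B), an applicative term is either a symbol name or a pair (operator, operand), and the rule tested is $F z x y \to f (F (F z y) y (z x)) x$ as we read it from Example 1.3. The "safe variant" is a hypothetical rule made up for contrast.

```python
O = "o"


def arrow(*types):
    """(A1, ..., An, B) as a right-nested pair encoding."""
    *args, result = types
    for a in reversed(args):
        result = (a, result)
    return result


def order(t):
    return 0 if t == O else max(order(t[0]) + 1, order(t[1]))


def app(head, *args):
    """Left-nested application: app(f, a, b) stands for (f a) b."""
    for a in args:
        head = (head, a)
    return head


def type_of(term, env):
    if isinstance(term, str):
        return env[term]
    fun, arg = term
    dom, cod = type_of(fun, env)
    assert dom == type_of(arg, env), "ill-typed application"
    return cod


def symbols(term):
    if isinstance(term, str):
        return {term}
    return symbols(term[0]) | symbols(term[1])


def is_unsafe(term, env, params):
    """Order k > 0 and some parameter of order < k occurs in the term."""
    k = order(type_of(term, env))
    return k > 0 and any(order(env[p]) < k for p in symbols(term) & params)


def rhs_is_safe(term, env, params, operator_position=False):
    """No unsafe subterm may occur outside an operator position (t s)."""
    if is_unsafe(term, env, params) and not operator_position:
        return False
    if isinstance(term, str):
        return True
    fun, arg = term
    return (rhs_is_safe(fun, env, params, True)
            and rhs_is_safe(arg, env, params, False))


env = {"F": arrow(arrow(O, O), O, O, O), "f": arrow(O, O, O),
       "z": arrow(O, O), "x": O, "y": O}
params = {"z", "x", "y"}

# F z x y -> f (F (F z y) y (z x)) x : the operand (F z y) is unsafe.
unsafe_rhs = app("f", app("F", app("F", "z", "y"), "y", app("z", "x")), "x")
# Hypothetical safe variant: F z x y -> f (F z y (z x)) x.
safe_rhs = app("f", app("F", "z", "y", app("z", "x")), "x")

print(rhs_is_safe(unsafe_rhs, env, params))  # False
print(rhs_is_safe(safe_rhs, env, params))    # True
```

In the first rule the subterm $F z y$ has order 1 and contains the order-0 parameter $y$, and it occurs as an operand; in the variant the same subterm only occurs in operator position, which the definition permits.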



Safety adapted to the lambda calculus. We assume a set $\Xi$ of higher-order constants.
We use sequents of the form $\Gamma \vdash^{\Xi}_{s} M : A$ to represent term-in-context where $\Gamma$ is the
context and $A$ is the type of $M$. For convenience, we shall omit the superscript from
$\vdash^{\Xi}_{s}$ whenever the set of constants $\Xi$ is clear from the context. The subscript in $\vdash^{\Xi}_{s}$ specifies
which type system is used to form the judgement: We use the subscript 'st' to refer to the
traditional system of rules of the Church-style simply-typed lambda calculus augmented
with constants from $\Xi$. We will introduce a new subscript for each type system that we
define. For simplicity we write $(A_1, \cdots, A_n, B)$ to mean $A_1 \to \cdots \to A_n \to B$, where $B$ is
not necessarily ground.

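Definition 1.4. (i) The safe lambda calculus is a sub-system of the simply-typed lambda
calculus. It is defined as the set of judgements of the form $\Gamma \vdash_{s} M : A$ that are derivable
from the following Church-style system of rules:

\[
(\mathrm{var})\ \dfrac{}{x : A \vdash_{s} x : A}
\qquad
(\mathrm{const})\ \dfrac{}{\vdash_{s} f : A}\ f \in \Xi
\qquad
(\mathrm{wk})\ \dfrac{\Gamma \vdash_{s} M : A}{\Delta \vdash_{s} M : A}\ \Gamma \subset \Delta
\]

\[
(\mathrm{app_{as}})\ \dfrac{\Gamma \vdash_{asa} M : A \to B \qquad \Gamma \vdash_{s} N : A}{\Gamma \vdash_{asa} M N : B}
\qquad
(\delta)\ \dfrac{\Gamma \vdash_{s} M : A}{\Gamma \vdash_{asa} M : A}
\]

\[
(\mathrm{app})\ \dfrac{\Gamma \vdash_{asa} M : A \to B \qquad \Gamma \vdash_{s} N : A}{\Gamma \vdash_{s} M N : B}\ \mathrm{ord}\, B \leq \mathrm{ord}\, \Gamma
\]

\[
(\mathrm{abs})\ \dfrac{\Gamma, x_1 : A_1, \ldots, x_n : A_n \vdash_{asa} M : B}{\Gamma \vdash_{s} \lambda x_1^{A_1} \ldots x_n^{A_n} . M : (A_1, \ldots, A_n, B)}\ \mathrm{ord}(A_1, \ldots, A_n, B) \leq \mathrm{ord}\, \Gamma
\]

where $\mathrm{ord}\, \Gamma$ denotes the set $\{\mathrm{ord}\, y : y \in \Gamma\}$ and "$c \leq S$" means that $c$ is a lower-bound of
the set $S$. The subscripts in $\vdash_{s}$ and $\vdash_{asa}$ stand for "safe" and "almost safe application".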

(ii) The sub-system that is defined by the same rules in (i), such that all types that 
occur in them are homogeneous, is called the homogeneous safe lambda calculus. 

(iii) We say that a term $M$ is safe if the judgement $\Gamma \vdash_{s} M : T$ is derivable in the safe
lambda calculus for some context $\Gamma$ and type $T$.
The safe lambda calculus deviates from the standard definition of the simply-typed
lambda calculus in a number of ways. First the rule (abs) can abstract several variables 
at once. (Of course this feature alone does not alter expressivity.) Crucially, the side 
conditions in the application rule and abstraction rule require the variables in the typing 
context to have orders no smaller than that of the term being formed. We do not impose 
any constraint on types. In particular, type-homogeneity, which was an assumption of the 
original definition of safe grammars [19], is not required here. Another difference is that we 
allow $\Xi$-constants to have arbitrary higher-order types.

Example 1.5 (Kierstead terms). Consider the terms $M_1 = \lambda f^{((o,o),o)} . f (\lambda x^{o} . f (\lambda y^{o} . y))$ and
$M_2 = \lambda f^{((o,o),o)} . f (\lambda x^{o} . f (\lambda y^{o} . x))$. The term $M_2$ is not safe because in the subterm
$f (\lambda y^{o} . x)$, the free variable $x$ has order 0 which is smaller than $\mathrm{ord}(\lambda y^{o} . x) = 1$. On the
other hand, $M_1$ is safe.

It is easy to see that valid typing judgements of the safe lambda calculus satisfy the 
following simple invariant: 

Lemma 1.6. If $\Gamma \vdash_{s} M : A$ then every variable in $\Gamma$ occurring free in $M$ has order at least
$\mathrm{ord}\, M$.
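
The invariant of Lemma 1.6 gives a quick necessary test for safety of a given subterm. The following sketch is only an illustration under assumed encodings (types as "o" or pairs, terms as tagged tuples, with helper names of our own choosing). It checks the invariant at the root of a term only and is not a decision procedure for safety. It is applied to the two operands $\lambda y^{o}.y$ and $\lambda y^{o}.x$ from the Kierstead terms, which must be safe because operands of an application are always typed with $\vdash_{s}$.

```python
O = "o"


def arrow(*types):
    *args, result = types
    for a in reversed(args):
        result = (a, result)
    return result


def order(t):
    return 0 if t == O else max(order(t[0]) + 1, order(t[1]))


def var(x):
    return ("var", x)


def app(head, *args):
    for a in args:
        head = ("app", head, a)
    return head


def lam(binders, body):
    """binders is a list of (name, type) pairs, abstracted at once."""
    return ("lam", tuple(binders), body)


def type_of(term, env):
    kind = term[0]
    if kind == "var":
        return env[term[1]]
    if kind == "app":
        dom, cod = type_of(term[1], env)
        assert dom == type_of(term[2], env), "ill-typed application"
        return cod
    _, binders, body = term
    return arrow(*[t for _, t in binders], type_of(body, {**env, **dict(binders)}))


def free_vars(term):
    kind = term[0]
    if kind == "var":
        return {term[1]}
    if kind == "app":
        return free_vars(term[1]) | free_vars(term[2])
    _, binders, body = term
    return free_vars(body) - {x for x, _ in binders}


def respects_lemma_1_6(term, env):
    """Every free variable has order at least the order of the term."""
    k = order(type_of(term, env))
    return all(order(env[x]) >= k for x in free_vars(term))


env = {"f": arrow(arrow(O, O), O), "x": O}
operand_of_m1 = lam([("y", O)], var("y"))  # the subterm of M1
operand_of_m2 = lam([("y", O)], var("x"))  # the subterm of M2

print(respects_lemma_1_6(operand_of_m1, env))  # True
print(respects_lemma_1_6(operand_of_m2, env))  # False
```

The second operand has order 1 but mentions the order-0 variable $x$, which is exactly the reason given in Example 1.5 for $M_2$ not being safe.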

Definition 1.7. A term is an almost safe application if it is safe or if it is of the form
$N_1 \ldots N_m$ for some $m \geq 1$ where $N_1$ is not an application and for every $1 \leq i \leq m$, $N_i$ is
safe.

A term is almost safe if either it is an almost safe application, or if it is of the form
$\lambda x_1^{A_1} \ldots x_n^{A_n} . M$ for $n \geq 1$ and some almost safe application $M$.

An almost safe application is not necessarily safe but it can be used to form a safe term 
by applying sufficiently many safe terms to it. An almost safe term can be turned into 
a safe term by either applying sufficiently many safe terms (if it is an application), or by 
abstracting sufficiently many variables (if it is an abstraction). 

We have the following immediate lemma: 

Lemma 1.8. A term $M$ is

(i) an almost safe application iff there is a derivation of $\Gamma \vdash_{asa} M : T$ for some $\Gamma$, $T$;

(ii) almost safe iff $\Gamma \vdash_{asa} M : T$ or if $M = \lambda x_1^{A_1} \ldots x_n^{A_n} . N$ and $\Gamma \vdash_{asa} N : T$ for some
$\Gamma$, $T$.

In particular, terms constructed with the rule $(\mathrm{app_{as}})$ are almost safe applications.

When restricted to the homogeneously-typed sub-system, the safe lambda calculus captures
the original notion of safety due to Knapik et al. in the context of higher-order grammars:

Proposition 1.9. Let $G = \langle \Sigma, \mathcal{N}, \mathcal{R}, S \rangle$ be a grammar and let $e$ be an applicative term
generated from the symbols in $\mathcal{N} \cup \Sigma \cup \{ z_1^{A_1}, \cdots, z_m^{A_m} \}$. A rule $F z_1 \ldots z_m \to e$ in $\mathcal{R}$ is safe
(in the original sense of Knapik et al.) if and only if $z_1 : A_1, \cdots, z_m : A_m \vdash^{\Sigma \cup \mathcal{N}}_{s} e : o$ is a
valid typing judgement of the homogeneous safe lambda calculus.

Proof. We show by induction that
(i) $z_1, \ldots, z_m \vdash_{asa} t : A$ is a valid judgement of the homogeneous safe lambda calculus
containing no abstraction if and only if in the Knapik sense, all the occurrences of unsafe
subterms of $t$ are safe occurrences.

(ii) $z_1, \ldots, z_m \vdash_{s} t : A$ is a valid judgement of the homogeneous safe lambda calculus
containing no abstraction if and only if in the Knapik sense, all the occurrences of unsafe
subterms of $t$ are safe occurrences, and all parameters occurring in $t$ have order at least
$\mathrm{ord}\, t$.

The constant and variable rules are trivial. Application case: By definition, a term $t_0 \ldots t_n$
is Knapik-safe iff for all $0 \leq i \leq n$, all the occurrences of unsafe subterms of $t_i$ are safe
occurrences (in the Knapik sense), and for all $1 \leq j \leq n$, the operands occurring in $t_j$
have order at least $\mathrm{ord}\, t_j$. The $(\mathrm{app_{as}})$ rule and the induction hypothesis permit us to
conclude.

Now since $e$ is an applicative term of ground type, the previous result gives:
$z_1, \ldots, z_m \vdash_{s} e : o$ is a valid judgement of the homogeneous safe lambda calculus iff all the
occurrences of unsafe subterms of $e$ are safe occurrences, which by definition of Knapik-safety
is in turn equivalent to saying that the rule $F z_1 \ldots z_m \to e$ is safe. □

In what sense is the safe lambda calculus safe? It is an elementary fact that when
performing β-reduction in the lambda calculus, one must use capture-avoiding substitution,
which is standardly implemented by renaming bound variables afresh upon each
substitution. In the safe lambda calculus, however, variable capture can never happen (as the
following lemma shows). Substitution can therefore be implemented simply by
capture-permitting replacement, without any need for variable renaming. In the following, we write
$M\{N/x\}$ to denote the capture-permitting substitution⁴ of $N$ for $x$ in $M$.

Lemma 1.10 (No variable capture). There is no variable capture when performing
capture-permitting substitution of $N$ for $x$ in $M$ provided that $\Gamma, x : B \vdash_{s} M : A$ and $\Gamma \vdash_{s} N : B$ are
valid judgements of the safe lambda calculus.

Proof. We proceed by structural induction on $M$. The variable, constant and application
cases are trivial. For the abstraction case, suppose $M = \lambda \overline{y} . R$ where $\overline{y} = y_1 \ldots y_p$. If $x \in \overline{y}$
then $M\{N/x\} = M$ and there is no variable capture.

Otherwise, $x \notin \overline{y}$. By Lemma 1.8, $R$ is of the form $M_1 \ldots M_m$ for some $m \geq 1$
where $M_1$ is not an application and for every $1 \leq i \leq m$, $M_i$ is safe. Thus we have
$M\{N/x\} = \lambda \overline{y} . M_1\{N/x\} \ldots M_m\{N/x\}$. Let $i \in \{1..m\}$. By the induction hypothesis there
is no variable capture in $M_i\{N/x\}$. Thus variable capture can only happen if the following
two conditions are met: (i) $x$ occurs freely in $M_i$, (ii) some variable $y_j$ for $1 \leq j \leq p$
occurs freely in $N$. By Lemma 1.6, (ii) implies $\mathrm{ord}\, y_j \geq \mathrm{ord}\, N = \mathrm{ord}\, x$ and since $x \notin \overline{y}$,
condition (i) implies that $x$ occurs freely in the safe term $\lambda \overline{y} . R$ thus by Lemma 1.6 we have
$\mathrm{ord}\, x \geq \mathrm{ord}\, \lambda \overline{y} . R \geq 1 + \mathrm{ord}\, y_j > \mathrm{ord}\, y_j$ which gives a contradiction. □

Remark 1.11. A version of the No-variable-capture Lemma also holds in safe grammars, as
is implicit in (for example Lemma 3.2 of) the original paper [19].

Example 1.12. In order to contract the β-redex in the term

$f : (o, o, o),\ x : o \vdash_{st} (\lambda \varphi^{(o,o)} x^{o} . \varphi\, x)\, \underline{(f\, x)} : (o, o)$

4 This substitution is done by textually replacing all free occurrences of $x$ in $M$ by $N$ without performing
variable renaming. In particular for the abstraction case we have $(\lambda y_1 \ldots y_n . M)\{N/x\} = \lambda y_1 \ldots y_n . M\{N/x\}$
when $x \notin \{y_1 \ldots y_n\}$.






one should rename the bound variable $x$ to a fresh name to prevent the capture of the free
occurrence of $x$ in the underlined term during substitution. Consequently, by the previous
lemma, the term is not safe (because $\mathrm{ord}\, x = 0 < 1 = \mathrm{ord}\, f x$).

Note that λ-terms that 'satisfy' the No-variable-capture Lemma are not necessarily
safe. For instance the β-redex in $\lambda y^{o} z^{o} . (\lambda x^{o} . y) z$ can be contracted using capture-permitting
substitution, even though the term is not safe.
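
To make capture-permitting substitution concrete, here is a small untyped sketch. The term encoding and the helper names are assumptions of this sketch, not the paper's. The function subst performs purely textual replacement, and vanished_free_vars is a crude detector that only reports variables of $N$ which disappear from the free-variable set; it can miss a capture when the same variable also remains free elsewhere.

```python
def var(x):
    return ("var", x)


def app(head, *args):
    for a in args:
        head = ("app", head, a)
    return head


def lam(names, body):
    return ("lam", tuple(names), body)


def free_vars(term):
    kind = term[0]
    if kind == "var":
        return {term[1]}
    if kind == "app":
        return free_vars(term[1]) | free_vars(term[2])
    return free_vars(term[2]) - set(term[1])


def subst(term, x, n):
    """Capture-permitting substitution M{N/x}: textual replacement, no renaming."""
    kind = term[0]
    if kind == "var":
        return n if term[1] == x else term
    if kind == "app":
        return ("app", subst(term[1], x, n), subst(term[2], x, n))
    _, names, body = term
    if x in names:  # x is rebound here, so nothing below is a free occurrence
        return term
    return ("lam", names, subst(body, x, n))


def vanished_free_vars(m, x, n):
    """Free variables of N that are no longer free in M{N/x} (a sign of capture)."""
    if x not in free_vars(m):
        return set()
    expected = (free_vars(m) - {x}) | free_vars(n)
    return expected - free_vars(subst(m, x, n))


# Example 1.12: contracting (λφ x. φ x)(f x) substitutes f x for φ in λx. φ x.
body = lam(["x"], app(var("phi"), var("x")))
arg = app(var("f"), var("x"))
print(subst(body, "phi", arg) == lam(["x"], app(var("f"), var("x"), var("x"))))  # True
print(vanished_free_vars(body, "phi", arg))  # {'x'}

# The redex in λy z. (λx. y) z contracts to y without any capture.
print(vanished_free_vars(var("y"), "x", var("z")))  # set()
```

In the first case the free $x$ of $f x$ ends up under the binder $\lambda x$, which is the capture that Lemma 1.10 rules out for safe terms.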

Related work: In her thesis [12], de Miranda proposed a different notion of safe lambda
calculus. This notion corresponds to (a less general version of) our notion of homogeneous
safe lambda calculus. It can be shown that for pure applicative terms (i.e., with no
lambda-abstraction) the two systems coincide. In particular a version of Proposition 1.9 also holds
in de Miranda's setting [12]. In the presence of lambda abstraction, however, our system is
less restrictive. For instance the term $\lambda f^{(o,o,o)} x^{o} . f x : (o, o)$ is typable in the homogeneous
safe lambda calculus but not in the safe lambda calculus à la de Miranda. One can show
that de Miranda's system is in fact equivalent to the homogeneous long-safe lambda calculus
(i.e., the restriction of the system of Def. 1.21 to homogeneous types).

Safe beta reduction. From now on we will use the standard notation $M[N/x]$ to denote
the substitution of $N$ for $x$ in $M$. It is understood that, provided that $M$ and $N$ are safe,
this substitution is capture-permitting.

Lemma 1.13 (Substitution preserves safety). Let $\Gamma \vdash_{s} N : B$. Then

(i) $\Gamma, x : B \vdash_{s} M : A$ implies $\Gamma \vdash_{s} M[N/x] : A$;

(ii) $\Gamma, x : B \vdash_{asa} M : A$ implies $\Gamma \vdash_{asa} M[N/x] : A$.

This is proved by an easy induction on the structure of the safe term M. 

It is desirable to have an appropriate notion of reduction for our calculus. However the
standard β-reduction rule is not adequate. Indeed, safety is not preserved by β-reduction as
the following example shows. Suppose that $w, z : o$ and $f : (o, o, o) \in \Xi$ then the safe term
$(\lambda x^{o} y^{o} . f x y) z w$ β-reduces to $\underline{(\lambda y^{o} . f z y)} w$, which is unsafe since the underlined first-order
subterm contains a free occurrence of the ground-type variable $z$. However if we perform
one more reduction we obtain the safe term $f z w$. This suggests simultaneous contraction
of "consecutive" β-redexes. In order to define this notion of reduction we first introduce
the corresponding notion of redex.

In the simply-typed lambda calculus a redex is a term of the form $(\lambda x . M) N$. In the
safe lambda calculus, a redex is a succession of several standard redexes:

Definition 1.14. A safe redex is an almost safe application of the form

$(\lambda x_1^{A_1} \ldots x_n^{A_n} . M) N_1 \ldots N_l$

for $l, n \geq 1$ such that $M$ is an almost safe application. (Consequently each $N_i$ is safe as well
as $\lambda x_1^{A_1} \ldots x_n^{A_n} . M$, and $M$ is either safe or is an application of safe terms.)

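For instance, in the case $n < l$, a safe redex has a derivation tree of the following form
(read top to bottom: each judgement follows from the one above it, together with the extra
premise shown in the middle column, by the rule named on the right):

\[
\begin{array}{lll}
\Gamma', \overline{x} : \overline{A} \vdash_{asa} M : (A_{n+1}, \ldots, A_l, B) & & \\
\Gamma' \vdash_{s} \lambda x_1^{A_1} \ldots x_n^{A_n} . M : (A_1, \ldots, A_l, B) & & (\mathrm{abs}) \\
\Gamma \vdash_{s} \lambda x_1^{A_1} \ldots x_n^{A_n} . M : (A_1, \ldots, A_l, B) & & (\mathrm{wk}) \\
\Gamma \vdash_{asa} \lambda x_1^{A_1} \ldots x_n^{A_n} . M : (A_1, \ldots, A_l, B) & & (\delta) \\
\Gamma \vdash_{asa} (\lambda x_1^{A_1} \ldots x_n^{A_n} . M) N_1 : (A_2, \ldots, A_l, B) & \Gamma \vdash_{s} N_1 : A_1 & (\mathrm{app_{as}}) \\
\qquad \vdots & & (\mathrm{app_{as}}) \\
\Gamma \vdash_{asa} (\lambda x_1^{A_1} \ldots x_n^{A_n} . M) N_1 \ldots N_{l-1} : (A_l, B) & \Gamma \vdash_{s} N_{l-1} : A_{l-1} & (\mathrm{app_{as}}) \\
\Gamma \vdash_{s} (\lambda x_1^{A_1} \ldots x_n^{A_n} . M) N_1 \ldots N_l : B & \Gamma \vdash_{s} N_l : A_l & (\mathrm{app})
\end{array}
\]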

A safe redex is by definition an almost safe term, but it is not necessarily a safe term. For
instance the term $(\lambda x^{o} y^{o} . x) z$ is a safe redex but it is only an almost safe term. The reason
why we call such redexes "safe" is that when they occur within a safe term, it is possible
to contract them without breaking the safety of the whole term. Before showing this result,
we first need to define how to contract safe redexes:

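Definition 1.15 (Redex contraction). We use the abbreviations $\overline{x} = x_1 \ldots x_n$ and
$\overline{N} = N_1 \ldots N_l$. The relation $\beta_s$ (when viewed as a function) is defined on the set of safe
redexes as follows:

\[
\begin{array}{rcl}
\beta_s & = & \{\ (\lambda x_1^{A_1} \ldots x_n^{A_n} . M) N_1 \ldots N_l \mapsto \lambda x_{l+1}^{A_{l+1}} \ldots x_n^{A_n} . M[\overline{N} / x_1 \ldots x_l] \ \mid\ n > l \ \} \\
 & \cup & \{\ (\lambda x_1^{A_1} \ldots x_n^{A_n} . M) N_1 \ldots N_l \mapsto M[N_1 \ldots N_n / \overline{x}]\, N_{n+1} \ldots N_l \ \mid\ n \leq l \ \}
\end{array}
\]

where $M[R_1 \ldots R_k / z_1 \ldots z_k]$ denotes the simultaneous substitution in $M$ of $R_1, \ldots, R_k$ for
$z_1, \ldots, z_k$.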
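
The two cases of this definition can be sketched in a few lines. The code below is an untyped illustration under assumed encodings (tagged tuples, helper names of our own); it performs the contraction on any term of the shape $(\lambda x_1 \ldots x_n . M) N_1 \ldots N_l$ and does not check that the redex is actually safe. The substitution is simultaneous and capture-permitting, which Lemma 1.10 justifies only for safe terms.

```python
def var(x):
    return ("var", x)


def app(head, *args):
    for a in args:
        head = ("app", head, a)
    return head


def lam(names, body):
    return ("lam", tuple(names), body)


def subst_many(term, mapping):
    """Simultaneous capture-permitting substitution of mapping[x] for each x."""
    kind = term[0]
    if kind == "var":
        return mapping.get(term[1], term)
    if kind == "app":
        return ("app", subst_many(term[1], mapping), subst_many(term[2], mapping))
    _, names, body = term
    inner = {x: n for x, n in mapping.items() if x not in names}
    return ("lam", names, subst_many(body, inner))


def contract(redex):
    """One beta_s step on (λx1...xn. M) N1 ... Nl; safety is NOT checked."""
    head, args = redex, []
    while head[0] == "app":
        args.append(head[2])
        head = head[1]
    args.reverse()
    assert head[0] == "lam" and args, "not of the shape (λx1...xn.M) N1...Nl"
    _, names, body = head
    n, l = len(names), len(args)
    k = min(n, l)
    new_body = subst_many(body, dict(zip(names[:k], args[:k])))
    if n > l:  # some binders are left over
        return ("lam", names[l:], new_body)
    for a in args[n:]:  # n <= l: re-apply the remaining arguments
        new_body = ("app", new_body, a)
    return new_body


# (λx y. f x y) z w contracts in one step to f z w.
redex = app(lam(["x", "y"], app(var("f"), var("x"), var("y"))), var("z"), var("w"))
print(contract(redex) == app(var("f"), var("z"), var("w")))  # True

# (λx y. x) z contracts to λy. z (case n > l).
print(contract(app(lam(["x", "y"], var("x")), var("z"))) == lam(["y"], var("z")))  # True
```

Contracting both arguments at once is what avoids the unsafe intermediate term $(\lambda y^{o} . f z y) w$ produced by a single standard β-step.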

Lemma 1.16 ($\beta_s$-reduction preserves safety). Suppose that $M_1\ \beta_s\ M_2$. Then

(i) $M_2$ is almost safe;

(ii) If $M_1$ is safe then so is $M_2$.

Proof. Let $M_1\ \beta_s\ M_2$ for some safe redex $M_1$ and term $M_2$ of type $A$. By definition, $M_1$ is
of the form $(\lambda x_1^{B_1} \ldots x_n^{B_n} . M) N_1 \ldots N_l$ for some safe terms $N_1, \ldots, N_l$ and almost safe term
$M$ of type $C$ such that $(\lambda x_1^{B_1} \ldots x_n^{B_n} . M)$ is safe.

• Suppose $n > l$ then $A = (B_{l+1}, \ldots, B_n, C)$. (i) By the Substitution Lemma 1.13 the
term $M[\overline{N} / x_1 \ldots x_l]$ is an almost safe application: we have
$\Gamma, x_{l+1} : B_{l+1}, \ldots, x_n : B_n \vdash_{asa} M[\overline{N} / x_1 \ldots x_l] : C$. (Indeed, if $M$ is safe then we apply the
Substitution Lemma once; otherwise it is of the form $R_1 \ldots R_q$ where each $R_i$ is a safe term
and we apply the lemma on each $R_i$.) Thus by definition,
$\lambda x_{l+1}^{B_{l+1}} \ldots x_n^{B_n} . M[\overline{N} / x_1 \ldots x_l] = M_2$ is almost safe.

(ii) Suppose that $M_1$ is safe. W.l.o.g. we can assume that the last rule used to
form $M_1$ is (app) (and not the weakening rule (wk)), thus the variables of the typing
context $\Gamma$ are precisely the free variables of $M_1$, and Lemma 1.6 gives us $\mathrm{ord}\, A \leq \mathrm{ord}\, \Gamma$.
This allows us to use the rule (abs) to form the safe term-in-context
$\Gamma \vdash_{s} \lambda x_{l+1}^{B_{l+1}} \ldots x_n^{B_n} . M[\overline{N} / x_1 \ldots x_l] = M_2 : A$.

• Suppose $n \leq l$. (i) Again by the Substitution Lemma we have that $M[N_1 \ldots N_n / \overline{x}]$ is an
almost safe application: $\Gamma \vdash_{asa} M[N_1 \ldots N_n / \overline{x}] : C$. If $n = l$ then the proof is finished;
otherwise ($n < l$) we further apply the rule $(\mathrm{app_{as}})$ $l - n$ times which gives us the almost
safe application $\Gamma \vdash_{asa} M_2 : A$.

(ii) Suppose that $M_1$ is safe. If $n = l$ then $M_2 = M[N_1 \ldots N_n / \overline{x}]$ is safe by the
Substitution Lemma; if $n < l$ then we obtain the judgement $\Gamma \vdash_{s} M_2 : A$ by applying
the rule $(\mathrm{app_{as}})$ $l - n - 1$ times on $\Gamma \vdash_{s} M[N_1 \ldots N_n / \overline{x}] : C$ followed by one application
of (app). □

We can now define a notion of reduction for safe terms. 

Definition 1.17. The safe β-reduction, written $\to_{\beta_s}$, is the compatible closure of the
relation $\beta_s$ with respect to the formation rules of the safe lambda calculus (i.e., it is the
smallest relation such that if $M_1\ \beta_s\ M_2$ and $C[M_1]$ is a safe term for some context $C[-]$
formed with the rules of the simply-typed lambda calculus then $C[M_1] \to_{\beta_s} C[M_2]$).

Lemma 1.18 ($\beta_s$-reduction preserves safety). If $\Gamma \vdash_{s} M_1 : A$ and $M_1 \to_{\beta_s} M_2$ then
$\Gamma \vdash_{s} M_2 : A$.

Proof. Follows from Lemma 1.16 by an easy induction. □

Lemma 1.19. The safe reduction relation $\to_{\beta_s}$:

(i) is a subset of the transitive closure of $\to_{\beta}$ ($\to_{\beta_s} \subset \twoheadrightarrow_{\beta}$);

(ii) is strongly normalizing;

(iii) has the unique normal form property;

(iv) has the Church-Rosser property.

Proof. (i) Immediate from the definition: Safe β-reduction is just a multi-step β-reduction.
(ii) This is because $\to_{\beta_s} \subset \twoheadrightarrow_{\beta}$ and $\to_{\beta}$ is strongly normalizing in the simply-typed
λ-calculus. (iii) It is easy to see that a safe term has a beta-redex if and only if it has
a safe beta-redex (because a beta-redex can always be "widened" into consecutive beta-redexes
of the shape of those in Def. 1.15). Therefore the set of $\beta_s$-normal forms is equal to the
set of β-normal forms. The uniqueness of β-normal form then implies the uniqueness of
$\beta_s$-normal form. (iv) is a consequence of (i) and (ii). □



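Eta-long expansion. The η-long normal form (or simply η-long form) of a term is obtained
by hereditarily η-expanding the body of every lambda abstraction as well as every subterm
occurring in an operand position (i.e., occurring as the second argument of some occurrence
of the binary application operator). Formally the η-long form, written $\lceil M \rceil$, of a
(type-annotated) term $M$ of type $(A_1, \ldots, A_n, o)$ with $n \geq 0$ is defined by cases according to the
syntactic shape of $M$:

\[
\begin{array}{rcl}
\lceil \lambda x^{\tau} . N \rceil & = & \lambda x^{\tau} . \lceil N \rceil \\
\lceil x N_1 \ldots N_m \rceil & = & \lambda \overline{\varphi}^{\overline{A}} . x \lceil N_1 \rceil \ldots \lceil N_m \rceil \lceil \varphi_1 \rceil \ldots \lceil \varphi_n \rceil \\
\lceil (\lambda x^{\tau} . N) N_1 \ldots N_p \rceil & = & \lambda \overline{\varphi}^{\overline{A}} . (\lambda x^{\tau} . \lceil N \rceil) \lceil N_1 \rceil \ldots \lceil N_p \rceil \lceil \varphi_1 \rceil \ldots \lceil \varphi_n \rceil
\end{array}
\]

where $m \geq 0$, $p \geq 1$, $x$ is either a variable or constant, $\overline{\varphi} = \varphi_1 \ldots \varphi_n$ and each $\varphi_i : A_i$ is
a fresh variable. The binder notation '$\lambda \overline{\varphi}^{\overline{A}}$' stands for '$\lambda \varphi_1^{A_1} \ldots \varphi_n^{A_n}$' if $n \geq 1$, and for '$\lambda$'
(called the dummy lambda) in the case $n = 0$. The base case of this inductive definition lies
in the second clause for $m = n = 0$: $\lceil x \rceil = \lambda . x$.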
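
The three clauses of this definition can be transcribed as a recursive function. The following is a sketch under assumptions that are ours and not the paper's: types are "o" or pairs, terms are tagged tuples, constants are treated like variables of the environment, the dummy lambda is represented by an abstraction with an empty binder tuple, and fresh variables are named phi1, phi2, ... avoiding only the names currently in scope.

```python
O = "o"


def arrow(*types):
    *args, result = types
    for a in reversed(args):
        result = (a, result)
    return result


def arg_types(t):
    """[A1, ..., An] for a type (A1, ..., An, o)."""
    out = []
    while t != O:
        out.append(t[0])
        t = t[1]
    return out


def type_of(term, env):
    kind = term[0]
    if kind == "var":
        return env[term[1]]
    if kind == "app":
        dom, cod = type_of(term[1], env)
        assert dom == type_of(term[2], env), "ill-typed application"
        return cod
    _, binders, body = term
    return arrow(*[t for _, t in binders], type_of(body, {**env, **dict(binders)}))


def fresh_names(env, count):
    """Pick `count` names of the form phi<i> that are not in scope."""
    names, i = [], 1
    while len(names) < count:
        if f"phi{i}" not in env:
            names.append(f"phi{i}")
        i += 1
    return names


def eta_long(term, env):
    if term[0] == "lam":  # first clause: expand the body
        _, binders, body = term
        return ("lam", binders, eta_long(body, {**env, **dict(binders)}))
    head, args = term, []
    while head[0] == "app":  # split N0 N1 ... Nm into head and arguments
        args.append(head[2])
        head = head[1]
    args.reverse()
    types = arg_types(type_of(term, env))
    phis = tuple(zip(fresh_names(env, len(types)), types))
    inner = {**env, **dict(phis)}
    # third clause if the head is an abstraction, second clause otherwise
    result = eta_long(head, env) if head[0] == "lam" else head
    for a in args:
        result = ("app", result, eta_long(a, env))
    for name, _ in phis:
        result = ("app", result, eta_long(("var", name), inner))
    return ("lam", phis, result)  # empty phis encodes the dummy lambda


# A ground-type variable: the base case gives the dummy lambda λ.x
print(eta_long(("var", "x"), {"x": O}))
# ('lam', (), ('var', 'x'))

# A variable f : (o, o) expands to λphi1. f (λ.phi1)
print(eta_long(("var", "f"), {"f": arrow(O, O)}))
# ('lam', (('phi1', 'o'),), ('app', ('var', 'f'), ('lam', (), ('var', 'phi1'))))
```

Because every operand is expanded recursively, the output has every application fully applied down to ground type, which is the property the long-safe system below relies on.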

Remark 1.20. This transformation does not introduce new redexes therefore the η-long
normal form of a β-normal term is also β-normal.

Let us introduce a new typing system: 
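Definition 1.21. We define the set of long-safe terms by induction over the following
system of rules:

\[
(\mathrm{var_l})\ \dfrac{}{x : A \vdash_{l} x : A}
\qquad
(\mathrm{const_l})\ \dfrac{}{\vdash_{l} f : A}\ f \in \Xi
\qquad
(\mathrm{wk_l})\ \dfrac{\Gamma \vdash_{l} M : A}{\Delta \vdash_{l} M : A}\ \Gamma \subset \Delta
\]

\[
(\mathrm{app_l})\ \dfrac{\Gamma \vdash_{l} M : (A_1, \ldots, A_n, B) \qquad \Gamma \vdash_{l} N_1 : A_1 \quad \ldots \quad \Gamma \vdash_{l} N_n : A_n}{\Gamma \vdash_{l} M N_1 \ldots N_n : B}\ \mathrm{ord}\, B \leq \mathrm{ord}\, \Gamma
\]

\[
(\mathrm{abs_l})\ \dfrac{\Gamma, x_1 : A_1, \ldots, x_n : A_n \vdash_{l} M : B}{\Gamma \vdash_{l} \lambda x_1^{A_1} \ldots x_n^{A_n} . M : (A_1, \ldots, A_n, B)}\ \mathrm{ord}(A_1, \ldots, A_n, B) \leq \mathrm{ord}\, \Gamma
\]

The subscript in $\vdash_{l}$ stands for "long-safe". This terminology is deliberately suggestive
of a forthcoming lemma. Note that long-safe terms are not necessarily in η-long normal
form.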

Observe that the system of rules from Def. 1.21 is a sub-system of the typing system of
Def. 1.4 where the application rule is restricted the same way as the abstraction rule (i.e.,
it can perform multiple applications at once provided that all the variables in the context
of the resulting term have order no smaller than the order of the term itself). Thus we clearly
have:

Lemma 1.22. If a term is long-safe then it is safe.
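
Since weakening lets every judgement be derived in the context of exactly its free variables, the long-safe rules suggest a simple syntax-directed check. The sketch below is our own reading of Definition 1.21 and not an algorithm from the paper: it omits constants, treats consecutive abstractions as one block, decomposes an application into its head and all of its arguments, and tests the side-condition against the free variables of the term being formed. The encodings and helper names are assumptions of the sketch.

```python
O = "o"


def arrow(*types):
    *args, result = types
    for a in reversed(args):
        result = (a, result)
    return result


def order(t):
    return 0 if t == O else max(order(t[0]) + 1, order(t[1]))


def var(x):
    return ("var", x)


def app(head, *args):
    for a in args:
        head = ("app", head, a)
    return head


def lam(binders, body):
    return ("lam", tuple(binders), body)


def type_of(term, env):
    kind = term[0]
    if kind == "var":
        return env[term[1]]
    if kind == "app":
        dom, cod = type_of(term[1], env)
        assert dom == type_of(term[2], env), "ill-typed application"
        return cod
    _, binders, body = term
    return arrow(*[t for _, t in binders], type_of(body, {**env, **dict(binders)}))


def free_vars(term):
    kind = term[0]
    if kind == "var":
        return {term[1]}
    if kind == "app":
        return free_vars(term[1]) | free_vars(term[2])
    return free_vars(term[2]) - {x for x, _ in term[1]}


def side_condition(term, env):
    """ord(term) is a lower bound of the orders of its free variables."""
    k = order(type_of(term, env))
    return all(k <= order(env[x]) for x in free_vars(term))


def is_long_safe(term, env):
    kind = term[0]
    if kind == "var":
        return True
    if kind == "lam":
        binders, body = list(term[1]), term[2]
        while body[0] == "lam":  # abstract a maximal block of variables at once
            binders += list(body[1])
            body = body[2]
        return is_long_safe(body, {**env, **dict(binders)}) and side_condition(term, env)
    head, args = term, []
    while head[0] == "app":
        args.append(head[2])
        head = head[1]
    return (all(is_long_safe(t, env) for t in [head] + args)
            and side_condition(term, env))


F = arrow(arrow(O, O), O)  # the type ((o, o), o)
m1 = lam([("f", F)], app(var("f"), lam([("x", O)], app(var("f"), lam([("y", O)], var("y"))))))
m2 = lam([("f", F)], app(var("f"), lam([("x", O)], app(var("f"), lam([("y", O)], var("x"))))))
k = lam([("y", O), ("z", O)], var("y"))
k_expanded = lam([("x", O)], app(k, var("x")))

print(is_long_safe(m1, {}))          # True
print(is_long_safe(m2, {}))          # False
print(is_long_safe(k, {}))           # True
print(is_long_safe(k_expanded, {}))  # False
```

The first Kierstead term passes and is therefore safe by Lemma 1.22. A failure of this check only shows that a term is not long-safe, which by itself does not settle whether it is safe.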

In general, long-safety is not preserved by η-expansion. For instance we have $\vdash_{l} \lambda y^{o} z^{o} . y : (o, o, o)$
but performing one eta-expansion produces the term $\lambda x^{o} . (\lambda y^{o} z^{o} . y) x : (o, o, o)$ which
is not long-safe. On the other hand, η-reduction (of one variable) preserves long-safety:

Lemma 1.23 (η-reduction of one variable preserves long-safety). $\Gamma \vdash_{l} \lambda \varphi^{\tau} . M \varphi : A$ with $\varphi$
not occurring free in $M$ implies $\Gamma \vdash_{l} M : A$.

Proof. Suppose $\Gamma \vdash_{l} \lambda \varphi^{\tau} . M \varphi : A$. If $M$ is an abstraction then by construction $M$ is
necessarily safe. If $M = N_0 \ldots N_p$ with $p \geq 1$ then again, since $\lambda \varphi^{\tau} . N_0 \ldots N_p \varphi$ is safe,
each of the $N_i$ is safe for $0 \leq i \leq p$ and for every variable $z$ occurring free in $\lambda \varphi^{\tau} . M \varphi$,
$\mathrm{ord}\, z \geq \mathrm{ord}(\lambda \varphi^{\tau} . M \varphi) = \mathrm{ord}\, M$. Since $\varphi$ does not occur free in $M$, the terms $M$ and
$\lambda \varphi^{\tau} . M \varphi$ have the same set of free variables, thus we can use the application rule to form
$\Gamma' \vdash_{l} N_0 \ldots N_p : A$ where $\Gamma'$ consists of the typing-assignments for the free variables of $M$.
The weakening rule permits us to conclude $\Gamma \vdash_{l} M : A$. □

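Lemma 1.24 (η-long expansion preserves long-safety). If $\Gamma \vdash_{l} M : A$ then $\Gamma \vdash_{l} \lceil M \rceil : A$.

Proof. First we observe that for every variable or constant $x : A$ we have $x : A \vdash_{l} \lceil x \rceil : A$.
We show this by induction on $\mathrm{ord}\, x$. It is verified for every ground type variable $x$ since
$x = \lceil x \rceil$. Step case: $x : A$ with $A = (A_1, \ldots, A_n, o)$ and $n > 0$. Let $\varphi_i : A_i$ be fresh variables
for $1 \leq i \leq n$. Since $\mathrm{ord}\, A_i < \mathrm{ord}\, x$ the induction hypothesis gives $\varphi_i : A_i \vdash_{l} \lceil \varphi_i \rceil : A_i$.
Using $(\mathrm{wk_l})$ we obtain $x : A, \overline{\varphi} : \overline{A} \vdash_{l} \lceil \varphi_i \rceil : A_i$. The application rule gives
$x : A, \overline{\varphi} : \overline{A} \vdash_{l} x \lceil \varphi_1 \rceil \ldots \lceil \varphi_n \rceil : o$ and the abstraction rule gives
$x : A \vdash_{l} \lambda \overline{\varphi} . x \lceil \varphi_1 \rceil \ldots \lceil \varphi_n \rceil = \lceil x \rceil : A$.

We now prove the lemma by induction on $M$. The base case is covered by the previous
observation. Step case:

• $M = x N_1 \ldots N_m$ with $x : (B_1, \ldots, B_m, A)$, $A = (A_1, \ldots, A_n, o)$ for some $m > 0$, $n \geq 0$
and $N_i : B_i$ for $1 \leq i \leq m$. Let $\varphi_i : A_i$ be fresh variables for $1 \leq i \leq n$. By the
previous observation we have $\varphi_i : A_i \vdash_{l} \lceil \varphi_i \rceil : A_i$, the weakening rule then gives us
$\Gamma, \overline{\varphi} : \overline{A} \vdash_{l} \lceil \varphi_i \rceil : A_i$. Since the judgement $\Gamma \vdash_{l} x N_1 \ldots N_m : A$ is formed using the
$(\mathrm{app_l})$ rule, each $N_j$ must be long-safe for $1 \leq j \leq m$, thus by the induction hypothesis
we have $\Gamma \vdash_{l} \lceil N_j \rceil : B_j$ and by weakening we get $\Gamma, \overline{\varphi} : \overline{A} \vdash_{l} \lceil N_j \rceil : B_j$. The $(\mathrm{app_l})$
rule then gives $\Gamma, \overline{\varphi} : \overline{A} \vdash_{l} x \lceil N_1 \rceil \ldots \lceil N_m \rceil \lceil \varphi_1 \rceil \ldots \lceil \varphi_n \rceil : o$. Finally the $(\mathrm{abs_l})$ rule gives
$\Gamma \vdash_{l} \lambda \overline{\varphi} . x \lceil N_1 \rceil \ldots \lceil N_m \rceil \lceil \varphi_1 \rceil \ldots \lceil \varphi_n \rceil = \lceil M \rceil : A$, the side-condition of $(\mathrm{abs_l})$ being verified
since $\mathrm{ord}\, \lceil M \rceil = \mathrm{ord}\, M$.

• $M = N_0 \ldots N_m$ where $N_0$ is an abstraction and $m \geq 1$. The eta-long normal form is
$\lceil M \rceil = \lambda \overline{\varphi} . \lceil N_0 \rceil \ldots \lceil N_m \rceil \lceil \varphi_1 \rceil \ldots \lceil \varphi_n \rceil$ for some fresh variables $\varphi_1, \ldots, \varphi_n$. Again, using
the induction hypothesis we can easily derive $\Gamma \vdash_{l} \lceil M \rceil : A$.

• $M = \lambda \overline{\eta}^{\overline{B}} . N$ where $N$ is of type $C$ and is not an abstraction. The induction hypothesis
gives $\Gamma, \overline{\eta} : \overline{B} \vdash_{l} \lceil N \rceil : C$ and using $(\mathrm{abs_l})$ we get $\Gamma \vdash_{l} \lambda \overline{\eta} . \lceil N \rceil = \lceil M \rceil : A$. □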

Remark 1.25.

(i) The converse of this lemma does not hold: performing $\eta$-reduction over a large ab-
straction does not in general preserve long-safety. (This does not contradict Lemma
1.23 which states that safety is preserved when performing $\eta$-reduction on an abstrac-
tion of a single variable.) A counter-example is $\lambda f^{(o,o,o)} g^{((o,o,o),o)}.g(\lambda x^{o}.f x)$, which is
not long-safe but whose $\eta$-long normal form $\lambda f^{(o,o,o)} g^{((o,o,o),o)}.g(\lambda x^{o} y^{o}.f x y)$ is long-safe.
There are also closed terms in eta-normal form that are not long-safe but have an
$\eta$-long normal form that is long-safe! Take for instance the closed $\beta\eta$-normal term

$$\lambda f^{(o,(o,o),o,o)} g^{(((o,o),o,o,o),o)}.g(\lambda y^{(o,o)} x^{o}.f x y) .$$

(ii) After performing $\eta$-long expansion of a term, all the occurrences of the application rule
are made long-safe. Thus if a term remains not long-safe after $\eta$-long expansion, this
means that some variable occurrence is not bound by the first following application of
the (abs) rule in the typing tree.

Lemma 1.26. A simply-typed term is safe if and only if its $\eta$-long normal form is long-safe.

Proof. Let $\Gamma \vdash_{st} M : T$. We want to show that we have $\Gamma \vdash_s M : T$ if and only if
$\Gamma \vdash_l \lceil M \rceil : T$. The 'only if' part can be proved by a trivial induction on the struc-
ture of $\Gamma \vdash_s M : T$. For the 'if' part we proceed by induction on the structure of the
simply-typed term $M$: The variable and constant cases are trivial. Suppose that $M$ is an
application of the form $x N_1 \ldots N_m : A$ for $m \ge 1$. Its $\eta$-long normal form is of the form
$x \lceil N_1 \rceil \ldots \lceil N_m \rceil \lceil \varphi_1 \rceil \ldots \lceil \varphi_m \rceil : o$ for some fresh variables $\varphi_1, \ldots, \varphi_m$. By assumption this
term is long-safe, therefore we have $\mathrm{ord}\,A \le \mathrm{ord}\,\Gamma$ and for $1 \le i \le m$, $\lceil N_i \rceil$ is also long-safe.
By the induction hypothesis this implies that the $N_i$s are all safe. We can then form the
judgement $\Gamma \vdash_s x N_1 \ldots N_m : A$ using the rules (var) and ($\delta$) followed by $m - 1$ applications
of the rule (app$_{as}$) and one application of (app) (this is allowed since we have $\mathrm{ord}\,A \le \mathrm{ord}\,\Gamma$).
The case $M = (\lambda x.N) N_1 \ldots N_m$ for $m \ge 1$ is treated identically.

Suppose that $M = \lambda \overline{x}^{\overline{B}}.N : A$. By assumption, its $\eta$-long n.f. $\lambda \overline{x}^{\overline{B}} \overline{\varphi}^{\overline{C}}.\lceil N \rceil \lceil \varphi_1 \rceil \ldots \lceil \varphi_m \rceil : A$
(for some fresh variables $\overline{\varphi} = \varphi_1 \ldots \varphi_m$ and types $\overline{C} = C_1 \ldots C_m$) is long-safe. Thus
we have $\mathrm{ord}\,A \le \mathrm{ord}\,\Gamma$. Furthermore the long-safe subterm $\lceil N \rceil \lceil \varphi_1 \rceil \ldots \lceil \varphi_m \rceil$ is pre-
cisely the eta-long normal form of $N \varphi_1 \ldots \varphi_m : o$, therefore by the induction hypothesis
we have that $N \varphi_1 \ldots \varphi_m : o$ is safe. Since the $\varphi_i$s are all safe (by rule (var)), we can
"peel off" $m$ applications (performed using the rules (app$_{as}$) or (app)) from the sequent
$\Gamma, \overline{x} : \overline{B}, \overline{\varphi} : \overline{C} \vdash_s N \varphi_1 \ldots \varphi_m : o$, which gives us the sequent $\Gamma, \overline{x} : \overline{B}, \overline{\varphi} : \overline{C} \vdash_{asa} N : A$. Since
the variables $\overline{\varphi}$ are fresh for $N$, we can further peel off applications of the weakening rule
to obtain the judgement $\Gamma, \overline{x} : \overline{B} \vdash_s N : A$.

Finally since we have $\mathrm{ord}\,A \le \mathrm{ord}\,\Gamma$, we can use the rule (abs) to form the sequent
$\Gamma \vdash_s \lambda \overline{x}^{\overline{B}}.N : A$. □

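Proposition 1.27. A term is safe if and only if its $\eta$-long normal form is safe.

Proof.

(If): $\Gamma \vdash_s \lceil M \rceil : T \implies \Gamma \vdash_l \lceil M \rceil : T$ by Lemma 1.26 (only if),
$\implies \Gamma \vdash_s M : T$ by Lemma 1.26 (if).

(Only if): $\Gamma \vdash_s M : T \implies \Gamma \vdash_l \lceil M \rceil : T$ by Lemma 1.26 (only if),
$\implies \Gamma \vdash_s \lceil M \rceil : T$ by Lemma [OH

□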



The type inhabitation problem. It is well known that the simply-typed lambda cal- 
culus corresponds to intuitionistic implicative logic via the Curry-Howard isomorphism. 
The theorems of the logic correspond to inhabited types, and every inhabitant of a type 
represents a proof of the corresponding formula. Similarly, we can consider the fragment 
of intuitionistic implicative logic that corresponds to the safe lambda calculus under the 
Curry-Howard isomorphism; we call it the safe fragment of intuitionistic implicative logic. 

We would like to compare the reasoning power of these two logics, in other words, to
determine which types are inhabited in the lambda calculus but not in the safe lambda
calculus.¹

If types are generated from a single atom $o$, then there is a positive answer: every
type generated from one atom that is inhabited in the lambda calculus is also inhabited in
the safe lambda calculus. Indeed, one can transform any unsafe inhabitant $M$ into a safe
one of the same type as follows: Compute the eta-long beta normal form of $M$. Let $x$ be
an occurrence of a ground-type variable in a subterm of the form $\lambda \overline{x}.C[x]$ where $\lambda \overline{x}$ is the
binder of $x$, for some context $C[-]$ different from the identity (defined as $C[R] = R$
for all $R$). We replace the subterm $\lambda \overline{x}.C[x]$ by $\lambda \overline{x}.x$ in $M$. This transformation is sound
because both $C[x]$ and $x$ are of the same ground type. We repeat this procedure until
the term stabilizes. This procedure clearly terminates since the size of the term decreases
strictly after each step. The final term obtained is safe and of the same type as $M$.

This argument cannot be generalized to types generated from multiple atoms. In
fact there are order-3 types with only 2 atoms that are inhabited in the simply-typed
lambda calculus but not in the safe lambda calculus. Take for instance the order-3 type
$(((b,a),b),((a,b),a),a)$ for some distinct atoms $a$ and $b$. It is only inhabited by the following
family of terms which are all unsafe:

$\lambda f^{((b,a),b)} g^{((a,b),a)}.g(\lambda x_1^a.f(\lambda y_1^b.x_1))$

$\lambda f^{((b,a),b)} g^{((a,b),a)}.g(\lambda x_1^a.f(\lambda y_1^b.g(\lambda x_2^a.y_1)))$

$\lambda f^{((b,a),b)} g^{((a,b),a)}.g(\lambda x_1^a.f(\lambda y_1^b.g(\lambda x_2^a.f(\lambda y_2^b.x_i))))$ where $i = 1, 2$

$\lambda f^{((b,a),b)} g^{((a,b),a)}.g(\lambda x_1^a.f(\lambda y_1^b.g(\lambda x_2^a.f(\lambda y_2^b.g(\lambda x_3^a.y_i)))))$ where $i = 1, 2$



¹ This problem was brought to our attention by Ugo dal Lago.

Another example is the type of function composition. For any atom $a$ and natural
number $n \in \mathbb{N}$, we define the types $n_a$ as follows: $0_a = a$ and $(n+1)_a = n_a \to a$. Take
three distinct atoms $a$, $b$ and $c$. For any $i, j, k \in \mathbb{N}$, we write $\sigma(i,j,k)$ to denote the type

$$\sigma(i,j,k) = (i_a \to j_b) \to (j_b \to k_c) \to i_a \to k_c .$$

For all $i, j, k$, this type is inhabited in the lambda calculus by the "function composition
term":

$$\lambda x y z.y(x z) .$$

This term is safe if and only if $i \ge j$ (for the subterm $x z$ is safe iff $i = \mathrm{ord}(i_a) = \mathrm{ord}\,z \ge
\mathrm{ord}(x z) = \mathrm{ord}(j_b) = j$). In the case $i < j$, the type $\sigma(i,j,k)$ may still be safely inhabited.
For instance $\sigma(1,3,4)$ is inhabited by the safe term

$$\lambda x^{1_a \to 3_b} y^{3_b \to 4_c} z^{1_a}.y(x(\lambda u^a.u)) .$$

The order-4 type $\sigma(0,2,0)$, however, is only inhabited by the unsafe term $\lambda x y z.y(x z)$.
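
The order computations behind these examples are easy to mechanise. The following Python sketch is only an illustration: the tuple representation of types and the helper names `arrow`, `order`, `n_atom`, `sigma` and `composition_term_is_safe` are assumptions of this example, not notation from the paper. It computes type orders and the criterion $i \ge j$ stated above for the composition term.

```python
from functools import reduce

# Assumed representation: an atom is a string, an arrow type A -> B is the pair (A, B).

def arrow(*types):
    """Right-nested arrow type: arrow(A, B, C) stands for A -> (B -> C)."""
    *args, result = types
    return reduce(lambda acc, arg: (arg, acc), reversed(args), result)

def order(t):
    """Order of a simple type: atoms have order 0, ord(A -> B) = max(ord A + 1, ord B)."""
    if isinstance(t, str):
        return 0
    arg, res = t
    return max(order(arg) + 1, order(res))

def n_atom(n, atom):
    """The type n_atom, with 0_a = a and (n+1)_a = n_a -> a."""
    t = atom
    for _ in range(n):
        t = (t, atom)
    return t

def sigma(i, j, k):
    """sigma(i,j,k) = (i_a -> j_b) -> (j_b -> k_c) -> i_a -> k_c."""
    ia, jb, kc = n_atom(i, "a"), n_atom(j, "b"), n_atom(k, "c")
    return arrow(arrow(ia, jb), arrow(jb, kc), ia, kc)

def composition_term_is_safe(i, j):
    """Criterion from the text for the composition term: ord z >= ord (x z), i.e. i >= j."""
    return order(n_atom(i, "a")) >= order(n_atom(j, "b"))
```

For instance, `order(sigma(0, 2, 0))` evaluates to 4, in agreement with the text calling $\sigma(0,2,0)$ an order-4 type.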

Statman showed [35] that the problem of deciding whether a type defined over an
infinite number of ground atoms is inhabited (or equivalently of deciding validity of an 
intuitionistic implicative formula) is PSPACE-complete. The previous observations suggest 
that the validity problem for the safe fragment of implicative logic may not be PSPACE- 
hard. 

2. Expressivity 

2.1. Numeric functions representable in the safe lambda calculus. Natural num-
bers can be encoded in the simply-typed lambda calculus using the Church Numerals: each
$n \in \mathbb{N}$ is encoded as the term $\overline{n} = \lambda s^{(o,o)} z^{o}.s^n z$ of type $I = ((o,o),o,o)$ where $o$ is a ground
type. We say that a $p$-ary function $f : \mathbb{N}^p \to \mathbb{N}$, for $p \ge 0$, is represented by a term
$F : (I, \ldots, I, I)$ (with $p + 1$ occurrences of $I$) if for all $m_i \in \mathbb{N}$, $1 \le i \le p$, we have:

$$F\,\overline{m_1} \ldots \overline{m_p} =_\beta \overline{f(m_1, \ldots, m_p)} .$$

Schwichtenberg [34] showed the following:

Theorem 2.1 (Schwichtenberg, 1976). The numeric functions representable by simply-
typed lambda-terms of type $I \to \ldots \to I$ using the Church Numeral encoding are exactly the
multivariate polynomials extended with the conditional function.

If we restrict ourselves to safe terms, the representable functions are exactly the multi-
variate polynomials:

Theorem 2.2. The functions representable by safe lambda-expressions of type $I \to \ldots \to I$
are exactly the multivariate polynomials.

Proof. Natural numbers are encoded as the Church Numerals: $\overline{n} = \lambda s z.s^n z$ for each
$n \in \mathbb{N}$. Addition: For $n, m \in \mathbb{N}$, $\overline{n + m} = \lambda \alpha^{(o,o)} x^{o}.(\overline{n}\,\alpha)(\overline{m}\,\alpha\,x)$. Multiplication: $\overline{n \cdot m} =
\lambda \alpha^{(o,o)}.\overline{n}(\overline{m}\,\alpha)$. These terms are all safe; furthermore function composition can be safely en-
coded: take a function $g : \mathbb{N}^n \to \mathbb{N}$ represented by a safe term $G$ of type $I^n \to I$ and functions
$f_1, \ldots, f_n : \mathbb{N}^p \to \mathbb{N}$ represented by safe terms $F_1, \ldots, F_n$ respectively; then the composed
function $(x_1, \ldots, x_p) \mapsto g(f_1(x_1, \ldots, x_p), \ldots, f_n(x_1, \ldots, x_p))$ is represented by the safe term
$\lambda c_1 \ldots c_p.G(F_1 c_1 \ldots c_p) \ldots (F_n c_1 \ldots c_p)$. Hence any multivariate polynomial $P(n_1, \ldots, n_k)$
can be computed by composing the addition and multiplication terms as appropriate.

For the converse, let $U$ be a safe lambda-term of type $I \to I \to I$. The generalization
to terms of type $I^n \to I$ for every $n \in \mathbb{N}$ is immediate (they correspond to polynomials with
$n$ variables). By Proposition 1.27 safety is preserved by $\eta$-long normal expansion, therefore we
can assume that $U$ is in $\eta$-long normal form.

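Let $\mathcal{N}^{\tau}_{\Sigma}$ denote the set of safe $\eta$-long $\beta$-normal terms of type $\tau$ with free variables in
$\Sigma$, and $\mathcal{A}^{\tau}_{\Sigma}$ the set of $\beta$-normal terms of type $\tau$ with free variables in $\Sigma$ and of the
form $\varphi s_1 \ldots s_m$ for some variable $\varphi : (A_1, \ldots, A_m, o)$ where $m \ge 0$ and for all $1 \le i \le m$,
$s_i \in \mathcal{N}^{A_i}_{\Sigma}$. Observe that the set $\mathcal{A}^{o}_{\Sigma}$ contains only safe terms but the sets $\mathcal{A}^{\tau}_{\Sigma}$ in general may
contain unsafe terms. Let $\Sigma$ denote the alphabet $\{x, y : I,\ z : o,\ \alpha : o \to o\}$. By an easy
reasoning (see the term grammar construction of Zaionc [37]), we can derive the following
equations inducing a grammar over the set of terminals $\Sigma \cup \{\lambda x y \alpha z.,\ \lambda z.\}$ that generates
precisely the terms of $\mathcal{N}^{(I,I,I)}_{\emptyset}$:

$$\begin{array}{rcl}
\mathcal{N}^{(I,I,I)}_{\emptyset} & \to & \lambda x y \alpha z.\,\mathcal{A}^{o}_{\Sigma} \\
\mathcal{A}^{o}_{\Sigma} & \to & z \mid \mathcal{A}^{(o,o)}_{\Sigma}\,\mathcal{A}^{o}_{\Sigma} \\
\mathcal{A}^{(o,o)}_{\Sigma} & \to & \alpha \mid \mathcal{A}^{I}_{\Sigma}\,\mathcal{N}^{(o,o)}_{\Sigma} \\
\mathcal{N}^{(o,o)}_{\Sigma} & \to & \lambda z.\,\mathcal{A}^{o}_{\Sigma} \\
\mathcal{A}^{I}_{\Sigma} & \to & x \mid y
\end{array}$$

The key rule is the fourth one: had we not imposed the safety constraint, the right-hand side
would instead be of the form $\lambda w^{o}.\mathcal{A}^{o}_{\Sigma \cup \{w : o\}}$. Here the safety constraint forces us to abstract
all the ground type variables occurring freely, thus only one free variable of ground type
can appear in the term and we can choose it to be named $z$ up to $\alpha$-conversion.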

We extend the notion of representability to terms of type $o$, $(o,o)$ and $I$ with free
variables in $\Sigma$ as follows: A function $f : \mathbb{N}^2 \to \mathbb{N}$ is represented by (i) $\Sigma \vdash_{st} F : o$ if and
only if for all $m, n \in \mathbb{N}$, $F[\overline{m}, \overline{n}/x, y] =_\beta \alpha^{f(m,n)} z$; (ii) $\Sigma \vdash_{st} G : (o,o)$ iff $G[\overline{m}, \overline{n}/x, y] =_\beta
\lambda z.\alpha^{f(m,n)} z$; (iii) $\Sigma \vdash_{st} H : I$ iff $H[\overline{m}, \overline{n}/x, y] =_\beta \lambda \alpha z.\alpha^{f(m,n)} z$.

We now show by induction on the grammar rules that any term generated by the
grammar represents some polynomial. Base case: The terms $x$ and $y$ represent the projection
functions $(m, n) \mapsto m$ and $(m, n) \mapsto n$ respectively. The terms $\alpha$ and $z$ represent the constant
functions $(m, n) \mapsto 1$ and $(m, n) \mapsto 0$ respectively. Step case: The first and fourth rules are
trivial: for $F \in \mathcal{A}^{o}_{\Sigma}$, the terms $\lambda z.F$ and $\lambda x y \alpha z.F$ represent the same function as $F$. We
now consider the second and third rules. We observe that for $m, p, p' \ge 0$ we have

(i) $\overline{m}(\lambda z.\alpha^{p} z) =_\beta \lambda z.\alpha^{m p} z$; (ii) $(\lambda z.\alpha^{p} z)(\alpha^{p'} z) =_\beta \alpha^{p + p'} z$.

Suppose that $F \in \mathcal{A}^{I}_{\Sigma}$ and $G \in \mathcal{N}^{(o,o)}_{\Sigma}$ represent the functions $f$ and $g$ respectively; then
by (i), $F G$ represents the function $f \times g$. If $F \in \mathcal{A}^{(o,o)}_{\Sigma}$ and $G \in \mathcal{A}^{o}_{\Sigma}$ represent the functions
$f$ and $g$ then by (ii), $F G$ represents the function $f + g$.

Hence $U$ represents some polynomial: for all $m, n \in \mathbb{N}$ we have $U\,\overline{m}\,\overline{n} =_\beta \lambda \alpha z.\alpha^{p(m,n)} z$
where $p(m,n) = \sum_{0 \le k \le d} m^{i_k} n^{j_k}$ for some $i_k, j_k \ge 0$, $d \ge 0$. □

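
The encodings used in the first half of the proof can be executed directly. The following Python sketch is an illustration only: the names `church`, `unchurch`, `add`, `mul` and `compose` are ours, and Python's untyped lambdas ignore both the simple types and the safety restriction, so the sketch checks the arithmetic behaviour of the terms and nothing more. It builds Church numerals, the addition and multiplication terms, and the composition scheme, and uses them to evaluate the polynomial $xy + x$.

```python
def church(n):
    """Church numeral for n: takes s, then z, and applies s to z exactly n times."""
    def numeral(s):
        def apply_n_times(z):
            for _ in range(n):
                z = s(z)
            return z
        return apply_n_times
    return numeral

def unchurch(numeral):
    """Read a Church numeral back as a Python int (helper assumed for this sketch)."""
    return numeral(lambda k: k + 1)(0)

# Addition term from the proof: lambda alpha x. (n alpha) (m alpha x)
add = lambda n: lambda m: lambda alpha: lambda x: n(alpha)(m(alpha)(x))

# Multiplication term from the proof: lambda alpha. n (m alpha)
mul = lambda n: lambda m: lambda alpha: n(m(alpha))

def compose(G, *Fs):
    """Composition scheme: lambda c_1 ... c_p. G (F_1 c_1 ... c_p) ... (F_n c_1 ... c_p).

    The arguments c_1 ... c_p are passed uncurried here for readability.
    """
    def composed(*cs):
        result = G
        for F in Fs:
            arg = F
            for c in cs:
                arg = arg(c)
            result = result(arg)
        return result
    return composed

# The polynomial (x, y) |-> x*y + x, obtained by composing add with mul and a projection.
first_projection = lambda x: lambda y: x
poly = compose(add, mul, first_projection)
```

For example, `unchurch(poly(church(3), church(4)))` evaluates $3 \cdot 4 + 3$.
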
Corollary 2.3. The conditional operator $C : I \to I \to I \to I$ satisfying:

$$C\,t\,y\,z \to_\beta \begin{cases} y, & \text{if } t \to_\beta \overline{0} ; \\ z, & \text{if } t \to_\beta \overline{n+1} \end{cases}$$

is not definable in the simply-typed safe lambda calculus.

Example 2.4. The term $\lambda F G H \alpha x.F\,\underline{(\lambda y.G \alpha x)}\,(H \alpha x)$ used by Schwichtenberg [34] to define
the conditional operator is unsafe since the underlined subterm, which is of order 1, occurs
at an operand position and contains an occurrence of $x$ of order 0.

Remark 2.5.

(i) This corollary tells us that the conditional function is not definable when numbers are
represented by the Church Numerals. It may still be possible, however, to represent
the conditional function using a different encoding for natural numbers. One way to
compensate for the loss of expressivity caused by the safety constraint is to introduce
countably many domains of representation for natural numbers. Such a technique is
used to represent the predecessor function in the simply-typed lambda calculus [14].

(ii) The boolean conditional can be represented in the safe lambda calculus as follows:
We encode booleans by terms of type $B = (o, o, o)$. The two truth values are then
represented by $\lambda x^{o} y^{o}.x$ and $\lambda x^{o} y^{o}.y$ and the conditional operator is given by the term
$\lambda F^{B} G^{B} H^{B} x^{o} y^{o}.F (G x y)(H x y)$.

(iii) It is also possible to define a conditional operator behaving like the conditional operator
$C$ in the second-order lambda calculus [13]: natural numbers are represented by terms
$\overline{n} = \Lambda t.\lambda s^{t \to t} z^{t}.s^n(z)$ of type $J = \Delta t.(t \to t) \to (t \to t)$ and the conditional is encoded
by the term $\lambda F^{J} G^{J} H^{J}.F\,J\,(\lambda u^{J}.G)\,H$. Whether this term is safe or not cannot be
answered just yet as we do not have a notion of safety for second-order typed terms.
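
The boolean conditional of item (ii) can be tried out as follows. This is a minimal Python sketch under the assumption that types are erased; the names `TRUE`, `FALSE`, `COND` and `decode` are introduced here for illustration only.

```python
# Truth values of type B = (o, o, o): lambda x y. x and lambda x y. y
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y

# Conditional from Remark 2.5(ii): lambda F G H x y. F (G x y) (H x y)
COND = lambda F: lambda G: lambda H: lambda x: lambda y: F(G(x)(y))(H(x)(y))

def decode(b):
    """Read an encoded boolean back as a Python bool (helper assumed for this sketch)."""
    return b(True)(False)
```

Applied to an encoded boolean `F`, the term `COND(F)(G)(H)` behaves like `G` when `F` is `TRUE` and like `H` when `F` is `FALSE`. Python evaluates both branches eagerly, which does not matter here because the branches are pure.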

2.2. Word functions definable in the safe lambda calculus. Schwichtenberg's result
on numeric functions definable in the lambda calculus was extended to richer structures:
Zaionc studied the problem for word functions, then functions over trees and eventually the
general case of functions over free algebras [20, 37, 38, 39, 40]. In this section we consider
the case of word functions expressible in the safe lambda calculus.

Word functions. We consider a binary alphabet $\Sigma = \{a, b\}$. The result of this section
naturally extends to all finite alphabets. We consider the set $\Sigma^*$ of all words over $\Sigma$. The
empty word is denoted $\epsilon$. We write $|w|$ to denote the length of the word $w \in \Sigma^*$. For any
$k \in \mathbb{N}$ we also write $k$ to denote the word $a \ldots a$ with $k$ occurrences of $a$, so that $|k| = k$. For
any $n \ge 1$ and $k \ge 0$, we write $c(n, k)$ for the $n$-ary function $(\Sigma^*)^n \to \Sigma^*$ that maps all
inputs to the word $k$. We consider various word functions. Let $x, y, z$ be words over $\Sigma$:

• Concatenation $app : (\Sigma^*)^2 \to \Sigma^*$. The word $app(x, y)$ is the concatenation of $x$ and $y$.

• Substitution $sub : (\Sigma^*)^3 \to \Sigma^*$. The word $sub(x, y, z)$ is obtained from $x$ by substituting
the word $y$ for all occurrences of $a$ and $z$ for all occurrences of $b$. Formally:

$sub(\epsilon, y, z) = \epsilon$,

$sub(ax, y, z) = app(y, sub(x, y, z))$,

$sub(bx, y, z) = app(z, sub(x, y, z))$.

• Prefix-cut $cut_a : \Sigma^* \to \Sigma^*$. The word $cut_a(x)$ is the maximal prefix of $x$ containing only
the letter 'a'. Formally:

$cut_a(\epsilon) = \epsilon$,
$cut_a(ax) = app(a, cut_a(x))$,
$cut_a(bx) = \epsilon$.

• Projections $\pi_k : (\Sigma^*)^n \to \Sigma^*$ for $n \ge 1$, $1 \le k \le n$, defined as $\pi_k(x_1, \ldots, x_k, \ldots, x_n) = x_k$.

• Constant functions $cst_w : \Sigma^* \to \Sigma^*$ for $w \in \Sigma^*$, mapping constantly onto the word $w$.

Additional operations can be obtained by combining the above functions [39]:

• Prefix-cut $cut_b : \Sigma^* \to \Sigma^*$ is defined by $cut_b(x) = sub(cut_a(sub(x, b, a)), b, a)$.

• Emptiness check $\overline{sq} : \Sigma^* \to \Sigma^*$ (returns 1 if the word is $\epsilon$ and 0 otherwise) is defined
by $\overline{sq}(x) = cut_a(app(sub(x, b, b), a))$.

• Non-emptiness check $sq : \Sigma^* \to \Sigma^*$ (returns 0 if the word is $\epsilon$ and 1 otherwise) is defined
by $sq(x) = \overline{sq}(\overline{sq}(x))$.

• Occurrence check $occ_l : \Sigma^* \to \Sigma^*$ of the letter $l \in \Sigma$ (returns 1 if the word contains an
occurrence of $l$ and 0 otherwise) is defined by $occ_l(x) = sq(sub(x, l, \epsilon))$.
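
The recursive definitions above translate directly into code. The following Python sketch represents words over $\{a, b\}$ as Python strings; the function names mirror the ones in the text, with `sq_bar` standing for the function defined through $cut_a(app(sub(x, b, b), a))$. It is an illustration under these naming assumptions, not part of the original development.

```python
def app(x, y):
    """Concatenation of two words."""
    return x + y

def sub(x, y, z):
    """Substitute y for every 'a' and z for every 'b' in x."""
    if x == "":
        return ""
    head, tail = x[0], x[1:]
    return app(y if head == "a" else z, sub(tail, y, z))

def cut_a(x):
    """Maximal prefix of x consisting only of the letter 'a'."""
    if x == "" or x[0] == "b":
        return ""
    return app("a", cut_a(x[1:]))

def cut_b(x):
    """Derived operation: swap the letters, cut the 'a'-prefix, swap back."""
    return sub(cut_a(sub(x, "b", "a")), "b", "a")

def sq_bar(x):
    """cut_a(app(sub(x, b, b), a)): evaluates to "a" on the empty word, "" otherwise."""
    return cut_a(app(sub(x, "b", "b"), "a"))

def sq(x):
    """sq_bar composed with itself: evaluates to "" on the empty word, "a" otherwise."""
    return sq_bar(sq_bar(x))

def occ_a(x):
    """sq(sub(x, a, epsilon)): evaluates to "a" exactly when x contains the letter 'a'."""
    return sq(sub(x, "a", ""))
```

Here the word 1 is the string `"a"` and the word 0 is the empty string, following the convention that $k$ denotes the word made of $k$ occurrences of $a$.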

Representability. We consider equality of terms modulo $\alpha$, $\beta$ and $\eta$ conversion, and we write
$M =_{\beta\eta} N$ to denote this equality. For every simple type $\tau$, we write $Cl(\tau)$ for the set of
closed terms of type $\tau$ (modulo $\alpha$, $\beta$ and $\eta$ conversion).

Take the type $B = (o \to o) \to (o \to o) \to o \to o$, called the binary word type [37].
There is a 1-1 correspondence between words over $\Sigma$ and closed terms of type $B$. Think
of the first two parameters as concatenators for 'a' and 'b' respectively, and the third
parameter as the constructor for the empty word. Thus the empty word $\epsilon$ is represented by
$\lambda u^{o \to o} v^{o \to o} x^{o}.x$; if $w \in \Sigma^*$ is represented by a term $W \in Cl(B)$ then $a \cdot w$ is represented by
$\lambda u^{o \to o} v^{o \to o} x^{o}.u(W u v x)$ and $b \cdot w$ is represented by $\lambda u^{o \to o} v^{o \to o} x^{o}.v(W u v x)$. For any word
$w \in \Sigma^*$ we write $\underline{w}$ to denote the term representation obtained that way. We say that the
word function $h : (\Sigma^*)^n \to \Sigma^*$ is represented by a closed term $H \in Cl(B^n \to B)$ just if for
all $x_1, \ldots, x_n \in \Sigma^*$, $H\,\underline{x_1} \ldots \underline{x_n} =_{\beta\eta} \underline{h\,x_1 \ldots x_n}$.

Example 2.6. The word functions $app$, $sub$, $cut_a$, $cut_b$, $sq$, $\overline{sq}$, $occ_a$, $occ_b$ defined above are
respectively represented by the following lambda-terms:

$APP = \lambda c d u v x.c u v (d u v x)$,          $SUB = \lambda c d e u v x.c(\lambda y.d u v y)(\lambda y.e u v y) x$,

$CUT_a = \lambda c u v x.c u (\lambda y.x) x$,          $CUT_b = \lambda c u v x.c(\lambda y.x) v x$,

$SQ = \lambda c u v x.c(\lambda y.u x)(\lambda y.u x) x$,          $\overline{SQ} = \lambda c u v x.c(\lambda y.x)(\lambda y.x)(u x)$,

$OCC_a = \lambda c u v x.c(\lambda y.u x)(\lambda y.y) x$,          $OCC_b = \lambda c u v x.c(\lambda y.y)(\lambda y.u x) x$.

Zaionc [37] showed that the $\lambda$-definable word functions are generated by a finite base
in the following sense:

Theorem 2.7 (Zaionc [37]). The set of $\lambda$-definable word functions is the minimal set con-
taining: (i) the constant functions; (ii) the projections; (iii) concatenation $app$; (iv) substi-
tution $sub$; (v) prefix-cut $cut_a$; and closed by composition.

The terms representing these basic operations are given in Example 2.6. We observe
that among them, only $APP$ and $SUB$ are safe; the other terms are all unsafe because they
contain terms of the form $N(\lambda y.x)$ where $x$ and $y$ are of the same order. It turns out that
$APP$ and $SUB$ constitute a base of terms generating all the functions definable in the safe
lambda calculus, as the following theorem states:

Theorem 2.8. Let $\lambda^{safe}\mathrm{def}$ denote the minimal set containing the following word functions
and closed by composition:

(i) the projections;

(ii) the constant functions;

(iii) concatenation $app$;

(iv) substitution $sub$.

The set of word functions definable in the safe lambda calculus is precisely $\lambda^{safe}\mathrm{def}$.

The proof follows the same steps as Zaionc's proof. The first direction is immediate:
Projections are represented by safe terms of the form $\lambda x_1 \ldots x_n.x_i$ for some $i \in \{1..n\}$, and
constant functions by $\lambda x_1 \ldots x_n.\underline{w}$ for some $w \in \Sigma^*$. The terms $APP$ and $SUB$ are safe
and represent concatenation and substitution. For closure by composition: take a function
$g : (\Sigma^*)^n \to \Sigma^*$ represented by a safe term $G \in Cl(B^n \to B)$ and functions $f_1, \ldots, f_n :
(\Sigma^*)^p \to \Sigma^*$ represented by safe terms $F_1, \ldots, F_n$ respectively; then the function

$$(x_1, \ldots, x_p) \mapsto g(f_1(x_1, \ldots, x_p), \ldots, f_n(x_1, \ldots, x_p))$$

is represented by the term $\lambda c_1 \ldots c_p.G(F_1 c_1 \ldots c_p) \ldots (F_n c_1 \ldots c_p)$ which is also safe.
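
The representation of words as closed terms of the binary word type, together with the terms $APP$ and $SUB$ of Example 2.6, can be exercised with untyped Python lambdas. In this sketch the helpers `encode` and `decode` and the composed term `H` are assumptions made for illustration; types and the safety condition are not checked, only the input/output behaviour of the terms.

```python
def encode(word):
    """Term representing `word`: takes u, v, x and applies u for each 'a', v for each 'b'."""
    def term(u):
        def with_v(v):
            def with_x(x):
                # Build u(...) / v(...) from the innermost (last) letter outwards.
                for letter in reversed(word):
                    x = u(x) if letter == "a" else v(x)
                return x
            return with_x
        return with_v
    return term

def decode(term):
    """Read a word term back as a string by instantiating u, v, x with string operations."""
    return term(lambda s: "a" + s)(lambda s: "b" + s)("")

# APP = lambda c d u v x. c u v (d u v x)
APP = lambda c: lambda d: lambda u: lambda v: lambda x: c(u)(v)(d(u)(v)(x))

# SUB = lambda c d e u v x. c (lambda y. d u v y) (lambda y. e u v y) x
SUB = (lambda c: lambda d: lambda e: lambda u: lambda v: lambda x:
       c(lambda y: d(u)(v)(y))(lambda y: e(u)(v)(y))(x))

# Closure by composition: a term for (x1, x2) |-> app(sub(x1, x2, x2), x1).
H = lambda c1: lambda c2: APP(SUB(c1)(c2)(c2))(c1)
```

For example, `decode(H(encode("ab"))(encode("b")))` first replaces both letters of `ab` by `b` and then appends `ab`.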

To show the other direction we need to introduce some more definitions. We will write
$Op(n, k)$ to denote the set of open terms $M$ typable as follows:

$$c_1 : B, \ldots, c_n : B,\ u : (o,o),\ v : (o,o),\ x_{k-1} : o, \ldots, x_0 : o \vdash_{st} M : o .$$

Thus we have the following equality (modulo $\alpha$, $\beta$ and $\eta$ conversions) for $n, k \ge 1$:

$$Cl(\tau(n,k)) = \{\lambda c_1^{B} \ldots c_n^{B} u^{(o,o)} v^{(o,o)} x_{k-1}^{o} \ldots x_0^{o}.M \mid M \in Op(n,k)\}$$

writing $\tau(n, k)$ as a shorthand for the type $B^n \to (o,o)^2 \to o^k \to o$. We generalize the
notion of representability to terms of type $\tau(n, k)$ as follows:

Definition 2.9 (Function pair representation). A closed term $T \in Cl(\tau(n,k))$ represents
the pair of functions $(f, p)$ where $f : (\Sigma^*)^n \to \Sigma^*$ and $p : (\Sigma^*)^n \to \{0, \ldots, k-1\}$ if for
all $w_1, \ldots, w_n \in \Sigma^*$ and for every $i \in \{0, \ldots, k-1\}$ we have:

$$T\,\underline{w_1} \ldots \underline{w_n} =_{\beta\eta} \lambda u v x_{k-1} \ldots x_0.\ \underline{f(w_1, \ldots, w_n)}\ u\,v\,x_{|p(w_1, \ldots, w_n)|} .$$

By extension we will say that an open term $M$ from $Op(n, k)$ represents the pair $(f, p)$ just
if $M[\underline{w_1} \ldots \underline{w_n}/c_1 \ldots c_n] =_{\beta\eta} \underline{f(w_1, \ldots, w_n)}\ u\,v\,x_{|p(w_1, \ldots, w_n)|}$.

We will call safe pair any pair of functions of the form $(w, c(n, i))$ where $0 \le i \le k - 1$
and $w$ is an $n$-ary function from $\lambda^{safe}\mathrm{def}$.

Theorem 2.10 (Characterization of the representable pairs). The function pairs repre-
sentable in the safe lambda calculus are precisely the safe pairs.

Proof. (Soundness). Take a pair $(w, c(n, i))$ where $0 \le i \le k - 1$ and $w$ is an $n$-ary function
from $\lambda^{safe}\mathrm{def}$. As observed earlier, all the functions from $\lambda^{safe}\mathrm{def}$ are representable in the safe
lambda calculus: Let $\underline{w}$ be the representative of $w$. The pair $(w, c(n, i))$ is then represented
by the term $\lambda c_1 \ldots c_n u v x_{k-1} \ldots x_0.\underline{w}\,c_1 \ldots c_n u v x_i$.

(Completeness). It suffices to consider safe $\beta$-$\eta$-long normal terms from $Op(n, k)$ only.
The result then follows immediately for every safe term in $Cl(\tau(n, k))$. The subset of
$Op(n, k)$ consisting of $\beta$-$\eta$-long normal terms is generated by the following grammar [37]:

$$\begin{array}{lrcl}
(\alpha^k_i) & R^k & \to & x_i \\
(\beta^k) & & \mid & u\,R^k \\
(\gamma^k) & & \mid & v\,R^k \\
(\delta^k_j) & & \mid & c_j\ (\lambda z^k.R^{k+1}[z^k, x_0, \ldots, x_{k-1}/x_0, x_1, \ldots, x_k])\ (\lambda z^k.R^{k+1}[z^k, x_0, \ldots, x_{k-1}/x_0, x_1, \ldots, x_k])\ R^k
\end{array}$$

for $k \ge 1$, $0 \le i < k$, $0 < j \le n$, where each argument $\lambda z^k.R^{k+1}[z^k, x_0, \ldots, x_{k-1}/x_0, x_1, \ldots, x_k]$ of $c_j$ is
abbreviated $Q^k(R^{k+1})$. The notation $M[\ldots/\ldots]$ denotes the usual simultaneous
substitution. The non-terminals are $R^k$ for $k \ge 1$ and the set of terminals is $\{z^k, \lambda z^k \mid k \ge
1\} \cup \{x_i \mid i \ge 0\} \cup \{c_1, \ldots, c_n, u, v\}$.

The name of each rule is indicated in parentheses. We identify a rule name with the
right-hand side of the rule, thus $\alpha^k_i$ belongs to $Op(n, k)$, $\beta^k$ and $\gamma^k$ are functions from
$Op(n, k)$ to $Op(n, k)$, and $\delta^k_j$ is a function from $Op(n, k+1) \times Op(n, k+1) \times Op(n, k)$ to
$Op(n, k)$.

We now want to characterize the subset consisting of all safe terms generated by this
grammar. The term $\alpha^k_i$ is always safe; $\beta^k(M)$ and $\gamma^k(M)$ are safe if and only if $M$ is; and
$\delta^k_j(F, G, H)$ is safe if and only if $Q^k(F)$, $Q^k(G)$ and $H$ are safe. The free variables of $Q^k(F)$
belong to $\{c_1, \ldots, c_n, u, v, x_0, \ldots, x_{k-1}\}$, thus they have order greater than $\mathrm{ord}\,z$ except the $x_i$s
which have the same order as $z$. Hence since the $x_i$s are not abstracted together with $z$ we
have that $Q^k(F)$ is safe if and only if $F$ is safe and the variables $x_0, \ldots, x_{k-1}$ do not appear free
in $F[z^k, x_0, \ldots, x_{k-1}/x_0, x_1, \ldots, x_k]$, or equivalently if the variables $x_1, \ldots, x_k$ do not appear
free in $F$. Similarly, $Q^k(G)$ is safe if and only if $G$ is safe and the variables $x_1, \ldots, x_k$ do not
appear free in $G$.

We therefore need to identify the subclass of terms generated by the non-terminal $R^k$
which are safe and which do not have any free occurrence of variables in $\{x_1, \ldots, x_{k-1}\}$. By
imposing this requirement on the rules of the previous grammar we obtain the following
specialized grammar characterizing the desired subclass:

$$\begin{array}{lrcl}
(\tilde{\alpha}^k_0) & \tilde{R}^k & \to & x_0 \\
(\tilde{\beta}^k) & & \mid & u\,\tilde{R}^k \\
(\tilde{\gamma}^k) & & \mid & v\,\tilde{R}^k \\
(\tilde{\delta}^k_j) & & \mid & c_j\ (\lambda z^k.\tilde{R}^{k+1}[z^k/x_0])\ (\lambda z^k.\tilde{R}^{k+1}[z^k/x_0])\ \tilde{R}^k .
\end{array}$$

For every term $M$, $Q^k(M)$ is safe if and only if $M$ can be generated from the non-terminal
$\tilde{R}^{k+1}$. Thus the subset of $Cl(\tau(n, k))$ consisting of safe beta-normal terms is given by the
grammar:

$$\begin{array}{lrcl}
(\pi^k) & S & \to & \lambda c_1 \ldots c_n u v x_{k-1} \ldots x_0.R^k \\
(\alpha^k_i) & R^k & \to & x_i \\
(\beta^k) & & \mid & u\,R^k \\
(\gamma^k) & & \mid & v\,R^k \\
(\delta^k_j) & & \mid & c_j\ (\lambda z^k.\tilde{R}^{k+1}[z^k/x_0])\ (\lambda z^k.\tilde{R}^{k+1}[z^k/x_0])\ R^k
\end{array}$$

To conclude the proof it thus suffices to show that every term generated by this grammar
(starting with the non-terminal $S$) represents a safe pair.

We proceed by induction and show that the non-terminal $\tilde{R}^k$ generates terms represent-
ing pairs of the form $(w, c(n, 0))$ while non-terminals $S$ and $R^k$ generate terms representing
pairs of the form $(w, c(n, i))$ for $0 \le i < k$ and $w \in \lambda^{safe}\mathrm{def}$.

Base case: The term $\tilde{\alpha}^k_0$ represents the safe pair $(c(n, 0), c(n, 0))$ while $\alpha^k_i$ represents
the safe pair $(c(n, 0), c(n, i))$. Step case: Suppose $T \in Op(n, k)$ represents a pair $(w, p)$.
Then $\tilde{\beta}^k(T)$ and $\beta^k(T)$ represent the pair $(app(a, w), p)$; $\tilde{\gamma}^k(T)$ and $\gamma^k(T)$ represent the
pair $(app(b, w), p)$; and $\pi^k(T) \in Cl(\tau(n, k))$ represents the pair $(w, p)$. Now suppose that $E$,
$F$ and $G$ represent the pairs $(w_e, c(n, 0))$, $(w_f, c(n, 0))$ and $(w_g, c(n, i))$ respectively. Then
we have:

\begin{align*}
& \delta^k_j(E, F, G)[\underline{w_1} \ldots \underline{w_n}/c_1 \ldots c_n] \\
&= \underline{w_j}\ (\lambda z^k.E[z^k/x_0])[\underline{w_1} \ldots \underline{w_n}/c_1 \ldots c_n]\ (\lambda z^k.F[z^k/x_0])[\underline{w_1} \ldots \underline{w_n}/c_1 \ldots c_n]\ G[\underline{w_1} \ldots \underline{w_n}/c_1 \ldots c_n] \\
&=_{\beta\eta} \underline{w_j}\ (\lambda z^k.E[\underline{w_1} \ldots \underline{w_n}/c_1 \ldots c_n][z^k/x_0])\ (\lambda z^k.F[\underline{w_1} \ldots \underline{w_n}/c_1 \ldots c_n][z^k/x_0])\ (\underline{w_g(w_1, \ldots, w_n)}\ u\,v\,x_i)
  && \text{$G$ represents $(w_g, c(n,i))$} \\
&=_{\beta\eta} \underline{w_j}\ (\lambda z^k.(\underline{w_e(w_1, \ldots, w_n)}\ u\,v\,x_0)[z^k/x_0])\ (\lambda z^k.(\underline{w_f(w_1, \ldots, w_n)}\ u\,v\,x_0)[z^k/x_0])\ (\underline{w_g(w_1, \ldots, w_n)}\ u\,v\,x_i)
  && \text{$E$ represents $(w_e, c(n,0))$, $F$ represents $(w_f, c(n,0))$} \\
&=_{\beta\eta} \underline{w_j}\ (\lambda z^k.\underline{w_e(w_1, \ldots, w_n)}\ u\,v\,z^k)\ (\lambda z^k.\underline{w_f(w_1, \ldots, w_n)}\ u\,v\,z^k)\ (\underline{w_g(w_1, \ldots, w_n)}\ u\,v\,x_i) \\
&=_{\eta} \underline{w_j}\ (\underline{w_e(w_1, \ldots, w_n)}\ u\,v)\ (\underline{w_f(w_1, \ldots, w_n)}\ u\,v)\ (\underline{w_g(w_1, \ldots, w_n)}\ u\,v\,x_i) \\
&=_{\beta\eta} \underline{w(w_1, \ldots, w_n)}\ u\,v\,x_i
\end{align*}

where the word function $w$ is defined as

$$w : w_1, \ldots, w_n \mapsto app(sub(w_j, w_e(w_1, \ldots, w_n), w_f(w_1, \ldots, w_n)), w_g(w_1, \ldots, w_n)) .$$

Hence $\delta^k_j(E, F, G)$ represents the pair $(w, c(n, i))$.

The same argument shows that if $E$, $F$ and $G$ all represent safe pairs then so does
$\tilde{\delta}^k_j(E, F, G)$. □

Theorem 2.8 is obtained by instantiating Theorem 2.10 with terms of types $\tau(n, 1) =
B^n \to B$: every closed safe term of this type represents some $n$-ary function from $\lambda^{safe}\mathrm{def}$.



2.3. Representability of functions over other structures. 

There is an isomorphism between binary trees and closed terms of type $\tau = (o \to
o \to o) \to o \to o$. Thus a closed term of type $\tau \to \ldots \to \tau \to \tau$ represents an $n$-ary
function over trees. Zaionc gave a characterization of the set of tree functions representable
in the simply-typed lambda calculus [38]: It is precisely the minimal set containing constant
functions, projections and closed under composition and limited primitive recursion. Zaionc
showed that the same characterization holds for the general case of functions expressed over
(different) free algebras [39, 40] (they are again given by the minimal set containing constant
functions, projections and closed under composition and limited primitive recursion). This
result subsumes Schwichtenberg's result on definable numeric functions as well as Zaionc's
own results on definable word and tree functions.

We have seen that constant functions, projections and composition can be encoded by
safe terms. Limited primitive recursion, however, cannot be encoded in the safe lambda
calculus (it can be used to define the conditional operator and the $cut_a$ word function). We
expect an appropriate restriction to limited recursion to characterize the functions over free
algebras representable in the safe lambda calculus.

3. Complexity of the safe lambda calculus 

This section is concerned with the complexity of the beta-eta equivalence problem for
the safe lambda calculus: Given two safe lambda-terms, are they equivalent up to $\beta\eta$-
conversion?

3.1. Statman's result. Let $\exp_h(m)$ denote the tower-of-exponential function defined by
induction as $\exp_0(m) = m$ and $\exp_{h+1}(m) = 2^{\exp_h(m)}$. A program is elementary recursive
if its run-time can be bounded by $\exp_K(n)$ for some constant $K$ where $n$ is the length of
the input.
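
As a quick illustration of how fast this bound grows, the tower-of-exponential function can be computed as follows. The function name `exp_tower` is chosen for this sketch.

```python
def exp_tower(h, m):
    """exp_0(m) = m and exp_{h+1}(m) = 2 ** exp_h(m)."""
    result = m
    for _ in range(h):
        result = 2 ** result
    return result
```

Already `exp_tower(4, 2)` is $2^{65536}$, a number with close to twenty thousand decimal digits.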

We recall the definition of finite type theory. We define $\mathcal{D}_0 = \{true, false\}$ and $\mathcal{D}_{k+1} =
\mathcal{P}(\mathcal{D}_k)$ (i.e., the powerset of $\mathcal{D}_k$). For $k \ge 0$, we write $x^k$, $y^k$ and $z^k$ to denote variables
ranging over $\mathcal{D}_k$. Prime formulae are $x^0$, $true \in y^1$, $false \in y^1$, and $x^k \in y^{k+1}$. Formulae
are built up from prime formulae using the logical connectives $\wedge$, $\vee$, $\to$, $\neg$ and the quantifiers
$\forall$ and $\exists$. Meyer showed that deciding the validity of such formulae requires nonelementary
time [26].
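
For small $k$ the hierarchy can be built explicitly, which makes the blow-up concrete. The sketch below is a brute-force illustration in Python (the helper names `powerset`, `domain` and `example_formula_is_valid` are assumptions of this example); it is not the encoding studied in this section.

```python
from itertools import chain, combinations

def powerset(elements):
    """All subsets of `elements`, each as a frozenset."""
    items = list(elements)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return frozenset(frozenset(subset) for subset in subsets)

def domain(k):
    """D_0 = {true, false} and D_{k+1} = powerset of D_k."""
    d = frozenset({True, False})
    for _ in range(k):
        d = powerset(d)
    return d

def example_formula_is_valid():
    """Brute-force check of the formula: for all x^0 there exists y^1 with x^0 in y^1."""
    return all(any(x in y for y in domain(1)) for x in domain(0))
```

The domains $\mathcal{D}_0$, $\mathcal{D}_1$, $\mathcal{D}_2$, $\mathcal{D}_3$ have 2, 4, 16 and 65536 elements respectively, so exhaustive evaluation of quantifiers is only feasible for the first few levels.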

A famous result by Statman states that deciding the $\beta\eta$-equality of two first-order
typable lambda-terms is not elementary recursive [36]. The proof proceeds by encoding
the Henkin quantifier elimination of type theory in the simply-typed lambda calculus and
by appealing to Meyer's result [26]. Simpler proofs have subsequently been given: one by
Mairson [23] and another by Loader [22]. Both proceed by encoding the Henkin quantifier
elimination procedure in the lambda calculus, as in the original proof, but their use of list
iteration to implement quantifier elimination makes them much easier to understand.

It turns out that all these encodings rely on unsafe terms: Statman's encoding uses 
the conditional function sg which is not definable in the safe lambda calculus [8]; Mairson's 
encoding uses unsafe terms to encode both quantifier elimination and set membership, and 
Loader's encoding uses unsafe terms to build list iterators. We are thus led to conjecture 
that finite type theory (see definition in Sec. 3.1) is intrinsically unsafe in the sense that
every encoding of it in the lambda calculus is necessarily unsafe. Of course this conjecture 
does not rule out the possibility that another non-elementary problem is encodable in the 
safe lambda calculus.

3.2. Mairson's encoding. We refer the reader to Mairson's original paper [23] for a de-
tailed account of his encoding. We show here why Mairson's encoding does not work in
the safe lambda calculus. We then introduce a variation that eliminates some of the un-
safety. Although the resulting encoding does not suffice to interpret type theory in the
safe lambda calculus, it enables another interesting encoding: that of the True Quantified
Boolean Formula (TQBF) problem. This implies that deciding beta-eta equality of safe
terms is PSPACE-hard.

3.2.1. Sources of unsafety. In Mairson's encoding, boolean values are encoded by terms of
type $B = \sigma \to \sigma \to \sigma$ for some type $\sigma$, and variables of order $k \ge 0$ are encoded by terms
of type $\Delta_k$ defined as $\Delta_0 = B$ and $\Delta_{k+1} = (\Delta_k \to \tau \to \tau) \to \tau \to \tau$ for any type $\tau$. Using
this encoding, unsafety manifests itself in three different places:

(i) Set membership: The prime formula "$x^k \in y^{k+1}$" is encoded by a term-in-context of
the form

$$x : \Delta_k,\ y : \Delta_{k+1} \vdash_{st} y(\lambda z^{\Delta_k}.M(x, z))\,F : \Delta_k \to \Delta_{k+1} \to \Delta_0 \qquad (3.1)$$

for some term $F$ and term $M(x, z)$ containing free occurrences of $x$ and $z$. This is
unsafe because the free occurrence of $x$ in $M(x, z)$ is not abstracted together with $z$.

(ii) Quantifier elimination is implemented using a list iterator $\mathbf{D}_{k+1}$ of type $\Delta_{k+2}$ which
acts like the foldr function (from functional programming) over the list of all elements
of $\mathcal{D}_k$. Thus nested quantifiers in the formula are encoded by nested list iterations.
This can be a source of unsafety; for instance the formula "$\forall x^0.\exists y^0.x^0 \vee y^0$" is encoded
as

$$\vdash_{st} \mathbf{D}_0(\lambda x^{\Delta_0}.AND(\mathbf{D}_0(\lambda y^{\Delta_0}.OR(\underline{x} \vee y))F))\,T : B$$

for some terms $AND$, $OR$, $F$ and $T$ and where the type $\tau$ is instantiated as $B$. This
term is unsafe due to the underlined occurrence which is unsafely bound.

More generally, nested binding will be encoded safely if and only if every variable $x$
in the formula is bound by the first quantifier $\exists z$ or $\forall z$ satisfying $\mathrm{ord}\,z \ge \mathrm{ord}\,x$ in the
path to the root of the formula AST. So for example if set-membership were safely
encodable then the interpretation of "$\forall x^k.\exists y^{k+1}.x^k \in y^{k+1}$" would be unsafe whereas
that of "$\forall y^{k+1}.\exists x^k.x^k \in y^{k+1}$" would be safe.

(iii) Elements of the type hierarchy. The base set $\mathcal{D}_0$ of booleans is represented by a safe
term $\mathbf{D}_0$ of type $\Delta_1$. Higher-order sets $\mathcal{D}_k$ for $k \ge 1$ are represented by unsafe terms
$\mathbf{D}_k$: they are constructed from $\mathbf{D}_0$ using a powerset construction that is unsafe.
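
The idea of item (ii), quantifier elimination by list iteration, can be sketched in a few lines of Python. The names `foldr`, `forall` and `exists` are assumptions of this example and Python imposes no safety restriction, so the sketch only mirrors the shape of the encoding: each quantifier becomes a fold over the list of all elements of the domain, and the inner fold refers to the variable bound by the outer one, which is exactly the occurrence flagged as unsafely bound above.

```python
def foldr(step, base, items):
    """Right fold: foldr(f, b, [a1, a2]) = f(a1, f(a2, b))."""
    result = base
    for item in reversed(items):
        result = step(item, result)
    return result

# The list of all elements of D_0.
D0 = [True, False]

def forall(domain, body):
    """Universal quantifier: fold with conjunction, starting from true."""
    return foldr(lambda value, rest: body(value) and rest, True, domain)

def exists(domain, body):
    """Existential quantifier: fold with disjunction, starting from false."""
    return foldr(lambda value, rest: body(value) or rest, False, domain)

# "forall x^0 . exists y^0 . x^0 or y^0": the inner body mentions the outer variable x.
formula = forall(D0, lambda x: exists(D0, lambda y: x or y))
```

The formula evaluates to true because $y$ can always be chosen to be true.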

The second source of unsafety can be easily overcome; the idea is as follows. We
introduce multiple domains of representation for a given formula. An element of $\mathcal{D}_k$ is
thereby represented by countably many terms of type $\Delta^n_k$ where $n \in \mathbb{N}$ indicates the level
of the domain of representation. The type $\Delta^n_k$ is defined in such a way that its order
strictly increases as $n$ grows. Furthermore, there exists a term that can lower the domain
of representation of a given term. Thus each formula variable can have a different domain
of representation, and since there are infinitely many such domains, it is always possible to
find an assignment of representation domains to variables such that the resulting encoding
term is safe.

There is no obvious way to eliminate unsafety in the two other cases however. For
instance in the case of set-membership, Mairson's encoding (3.1) could be made safe by
appealing to a term that changes the domain of representation of an encoded higher-order
value of the type-hierarchy. Unfortunately, such a transformation is intrinsically unsafe!

In the following paragraphs we present in detail a variation on Mairson's encoding in
which quantifier elimination is safely encoded.

3.2.2. Encoding basic boolean operations. Let $o$ be a base type and define the family of
types $\sigma_0 = o$, $\sigma_{n+1} = \sigma_n \to \sigma_n$ satisfying $\mathrm{ord}\,\sigma_n = n$. Booleans are encoded over domains
$B_n = \sigma_n \to o \to o \to o$ for $n \geq 0$, each type $B_n$ being of order $n+1$. We write $i_{n+1}$ to denote
the term $\lambda x^{\sigma_n}.x$ of type $\sigma_{n+1}$ for $n \geq 0$. The truth values true and false are represented
by the following terms parameterized by $n \in \mathbb{N}$:

\[ T^n = \lambda u^{\sigma_n} x^o y^o.x : B_n \]

\[ F^n = \lambda u^{\sigma_n} x^o y^o.y : B_n. \]

Clearly these terms are safe. Moreover the following relations hold for all $n, n' \geq 0$:

\[ \lambda u^{\sigma_{n'}}.T^{n+1}\,i_{n+1} \to_\beta T^{n'} \]

\[ \lambda u^{\sigma_{n'}}.F^{n+1}\,i_{n+1} \to_\beta F^{n'}. \]

It is then possible to change the domain of representation of a Boolean value from a higher
level to another arbitrary level using the conversion term:

\[ C_0^{n+1\mapsto n'} = \lambda m^{B_{n+1}} u^{\sigma_{n'}}.m\,i_{n+1} : B_{n+1} \to B_{n'}, \]

so that if a term $M$ of type $B_n$, for $n \geq 1$, is beta-eta convertible to $T^n$ (resp. $F^n$) then
$C_0^{n\mapsto n'}\,M$ of type $B_{n'}$ is beta-eta convertible to $T^{n'}$ (resp. $F^{n'}$).
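As a check on the conversion term, the reduction below applies it to the encoding of true. It is only a worked sketch: it assumes the definitions read $T^{n+1} = \lambda u^{\sigma_{n+1}} x^o y^o.x$, $i_{n+1} = \lambda x^{\sigma_n}.x$ and $C_0^{n+1\mapsto n'} = \lambda m\,u.m\,i_{n+1}$, and the bound variable $u'$ is renamed here purely for readability. The case of false is identical with $y$ in place of $x$.

```latex
\begin{align*}
C_0^{n+1\mapsto n'}\,T^{n+1}
  &= (\lambda m^{B_{n+1}} u^{\sigma_{n'}}.\,m\,i_{n+1})\,T^{n+1} \\
  &\to_\beta \lambda u^{\sigma_{n'}}.\,T^{n+1}\,i_{n+1} \\
  &= \lambda u^{\sigma_{n'}}.\,(\lambda u'^{\sigma_{n+1}} x^o y^o.\,x)\,i_{n+1} \\
  &\to_\beta \lambda u^{\sigma_{n'}} x^o y^o.\,x \;=\; T^{n'}
\end{align*}
```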

Observe that although $C_0^{n+1\mapsto n'}$ is safe for all $n, n' \geq 0$, if we apply it to a variable then
the resulting term-in-context

\[ x : B_{n+1} \vdash_{st} C_0^{n+1\mapsto n'}\,x : B_{n'} \]

is safe if and only if $\mathrm{ord}\,B_{n+1} \geq \mathrm{ord}\,B_{n'}$, that is to say if and only if the transformation
does not increase the domain of representation of $x$.
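The order bookkeeping used in this subsection can be checked mechanically. The following is a minimal sketch, assuming the usual definition of type order (ord o = 0 and ord(A → B) = max(1 + ord A, ord B)); the helper names `arrow`, `order`, `sigma`, `B` and `conversion_applied_is_safe` are illustrative and do not come from the paper.

```python
def arrow(a, b):
    """Represent the type a -> b as a pair; the base type is the string 'o'."""
    return (a, b)

def order(t):
    """Type order: ord(o) = 0, ord(a -> b) = max(1 + ord(a), ord(b))."""
    if t == "o":
        return 0
    a, b = t
    return max(1 + order(a), order(b))

def sigma(n):
    """sigma_0 = o, sigma_{n+1} = sigma_n -> sigma_n."""
    t = "o"
    for _ in range(n):
        t = arrow(t, t)
    return t

def B(n):
    """B_n = sigma_n -> o -> o -> o."""
    return arrow(sigma(n), arrow("o", arrow("o", "o")))

def conversion_applied_is_safe(n, n_prime):
    """Condition stated in the text for the conversion applied to x : B_{n+1}."""
    return order(B(n + 1)) >= order(B(n_prime))

# Keep n small: this naive recursion revisits shared subterms.
print([order(sigma(n)) for n in range(5)])
print([order(B(n)) for n in range(5)])
```

With these definitions, lowering a level (for example from 3 to 1) satisfies the condition, while raising it (from 1 to 3) does not.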

Boolean functions are encoded by the following closed safe terms parameterized by $n$:

\[ AND^n = \lambda p^{B_n} q^{B_n} u^{\sigma_n} x^o y^o.p\,u\,(q\,u\,x\,y)\,y : B_n \to B_n \to B_n \]

\[ OR^n = \lambda p^{B_n} q^{B_n} u^{\sigma_n} x^o y^o.p\,u\,x\,(q\,u\,x\,y) : B_n \to B_n \to B_n \]

\[ NOT^n = \lambda p^{B_n} u^{\sigma_n} x^o y^o.p\,u\,y\,x : B_n \to B_n. \]
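The behaviour of these terms can be tried out directly with untyped closures. The sketch below transcribes the truth values and the three connectives into Python lambdas; since Python is untyped, the level n (the type of the dummy argument u) is not represented, and the `decode` helper is an addition made for this illustration only.

```python
# Booleans with an extra dummy first argument u, mirroring
# T = \u x y. x  and  F = \u x y. y.
T = lambda u: lambda x: lambda y: x
F = lambda u: lambda x: lambda y: y

# Connectives, transcribed from the definitions of AND, OR and NOT.
AND = lambda p: lambda q: lambda u: lambda x: lambda y: p(u)(q(u)(x)(y))(y)
OR = lambda p: lambda q: lambda u: lambda x: lambda y: p(u)(x)(q(u)(x)(y))
NOT = lambda p: lambda u: lambda x: lambda y: p(u)(y)(x)

def decode(b):
    """Read an encoded boolean back as a Python bool (helper for this sketch)."""
    return b(None)(True)(False)

# Print the truth tables of AND and OR.
for p in (T, F):
    for q in (T, F):
        print(decode(p), decode(q), decode(AND(p)(q)), decode(OR(p)(q)))
```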



3.2.3. Coding elements of the type hierarchy. For every $n \in \mathbb{N}$ we define the hierarchy of
types $\Delta^n_k$ as follows: $\Delta^n_0 = B_n$ and $\Delta^n_{k+1} = (\Delta^n_k)^*$ where for a given type $\alpha$,
$\alpha^* = (\alpha \to \tau \to \tau) \to \tau \to \tau$ for any type $\tau$. We encode an occurrence $x^k$ of a formula variable by a term
variable $x^k$ of type $\Delta^n_k$ for some level of domain representation $n \in \mathbb{N}$. Following Mairson's
encoding, each set is represented by a list $\mathbf{D}^n_k$ consisting of all its elements:

\[ \mathbf{D}^n_0 = \lambda c^{B_n \to \tau \to \tau} e^{\tau}.c\,T^n\,(c\,F^n\,e) : \Delta^n_1 \]
\[ \mathbf{D}^n_{k+1} = powerset_{\Delta^n_k}\,\mathbf{D}^n_k : \Delta^n_{k+2} \]

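where

\[ powerset_\alpha = \lambda A^{(\alpha \to \alpha^{**} \to \alpha^{**}) \to \alpha^{**} \to \alpha^{**}}.A\ double_\alpha\ (\lambda c^{\alpha^* \to \tau \to \tau} b^{\tau}.c\,(\lambda c'^{\alpha \to \tau \to \tau} b'^{\tau}.b')\,b) \]
\[ \qquad : ((\alpha \to \alpha^{**} \to \alpha^{**}) \to \alpha^{**} \to \alpha^{**}) \to \alpha^{**} \]

\[ double_\alpha = \lambda x^{\alpha} l^{\alpha^{**}} c^{\alpha^* \to \tau \to \tau} b^{\tau}.l\,(\lambda e^{\alpha^*}.c\,(\lambda c'^{\alpha \to \tau \to \tau} b'^{\tau}.c'\,\underline{x}\,(e\,c'\,b')))\,(l\,c\,b) \]
\[ \qquad : \alpha \to \alpha^{**} \to \alpha^{**} \]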

(In the definition of $\mathbf{D}^n_{k+1}$, to see why it is possible to apply $powerset_{\Delta^n_k}$ to $\mathbf{D}^n_k$ one needs
to understand that the term $\mathbf{D}^n_k$ is of type $\Delta^n_{k+1}$ polymorphic in $\tau$. The application can
thus be typed by taking $\tau = \Delta^n_{k+2}$ in the term $\mathbf{D}^n_k$.)

Observe that the term double is unsafe because the underlined variable occurrence $x$ is
not bound together with $c'$. Consequently for all $n \geq 0$, $\mathbf{D}^n_0$ is safe and $\mathbf{D}^n_k$ is unsafe for all
$k > 0$.
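The list-based powerset construction can be illustrated with Church-encoded lists in an untyped setting. This is a hedged sketch rather than the paper's terms: Python booleans stand in for the encoded truth values, types and safety are not modelled, the body of `double` follows the reading "prepend x to every member of l, then append l itself", and `nil`, `cons` and `to_py` are helper names introduced here.

```python
# Church-encoded lists: a list is its own right fold, \c e. c x1 (c x2 (... e)).
nil = lambda c: lambda e: e
cons = lambda x: lambda l: lambda c: lambda e: c(x)(l(c)(e))

# D0 lists the two booleans (Python bools stand in for the encoded T and F).
D0 = lambda c: lambda e: c(True)(c(False)(e))

def double(x):
    """double x l: prepend x to every member of l, then append l itself."""
    return lambda l: lambda c: lambda b: l(lambda e: c(cons(x)(e)))(l(c)(b))

def powerset(A):
    """Fold `double` over A, starting from the list that contains only nil."""
    return A(double)(lambda c: lambda b: c(nil)(b))

def to_py(l):
    """Convert a Church-encoded list to a Python list (one level only)."""
    return l(lambda x: lambda acc: [x] + acc)([])

D1 = powerset(D0)
subsets = [to_py(s) for s in to_py(D1)]
print(subsets)
```

Folding `double` over the two-element list yields four sublists, one for each subset of the booleans.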



3.2.4. Quantifier elimination. Terms of type $\Delta^n_{k+1}$ are now used as iterators over lists of
elements of type $\Delta^n_k$ and we set $\tau = B_n$ in the type $\Delta^n_{k+1}$ in order to iterate a level-$n$
Boolean function. Since $\mathrm{ord}\,\Delta^n_k \geq \mathrm{ord}\,B_n$ for all $n$, all the instantiations of the terms
$\mathbf{D}^n_k$ will be safe (although the terms themselves are not safe for $k \geq 1$). Following
[23], quantifier elimination interprets the formula $\forall x^k.\Phi(x^k)$ as the iterated conjunction
$C_0^{n\mapsto 0}\big(\mathbf{D}^n_k(\lambda x^{\Delta^n_k}.AND^n(\phi\,x))\,T^n\big)$ where $\phi$ is the interpretation of $\Phi$ and $n$ is the
representation level chosen for the variable $x^k$. Similarly we interpret $\exists x^k.\Phi(x^k)$ by the iterated
disjunction $C_0^{n\mapsto 0}\big(\mathbf{D}^n_k(\lambda x^{\Delta^n_k}.OR^n(\phi\,x))\,F^n\big)$.
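Operationally, the iterated conjunction and disjunction are right folds over the list of all elements of the domain. The sketch below shows this for the base domain only, using Python booleans instead of encoded ones and ignoring representation levels and conversion terms; `forall` and `exists` are names chosen for the illustration.

```python
# The list of both truth values, as a right fold (\c e. c T (c F e)).
D0 = lambda c: lambda e: c(True)(c(False)(e))

def forall(domain, phi):
    """Iterated conjunction: fold AND over the domain, starting from True."""
    return domain(lambda x: lambda acc: phi(x) and acc)(True)

def exists(domain, phi):
    """Iterated disjunction: fold OR over the domain, starting from False."""
    return domain(lambda x: lambda acc: phi(x) or acc)(False)

# forall x. exists y. x or y
result = forall(D0, lambda x: exists(D0, lambda y: x or y))
print(result)
```

Nested quantifiers become nested folds, which is exactly where the binding of the outer variable inside the inner iteration has to be checked for safety.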

3.2.5. Encoding the formula. Given a formula of type theory, it is possible to encode it in 
the lambda calculus by inductively applying the above encodings of boolean operations and 
quantifiers on the formula; each variable occurrence in the formula being assigned some 
domain of representation. 

We now show that there exists an assignment of representation domains for each variable
occurrence such that the resulting term is safe. Let $x_p^{k_p}, \ldots, x_1^{k_1}$ for $p \geq 1$ be the list of
variables appearing in the formula, given in order of appearance of their binder in the
formula (i.e., $x_p^{k_p}$ is bound by the leftmost binder). We fix the domain of representation of
each variable as follows. The right-most variable $x_1^{k_1}$ is encoded in the domain $\Delta^0_{k_1}$; and if
for $1 \leq i < p$ the domain of representation of $x_i^{k_i}$ is $\Delta^l_{k_i}$ then the domain of representation
of $x_{i+1}^{k_{i+1}}$ is defined as $\Delta^{l'}_{k_{i+1}}$ where $l'$ is the smallest natural number such that $\mathrm{ord}\,\Delta^{l'}_{k_{i+1}}$ is
strictly greater than $\mathrm{ord}\,\Delta^l_{k_i}$.

This way, since variables that are bound first have higher order, variables that are 
bound in nested list-iterations — corresponding to nested quantifiers in the formula — are 
guaranteed to be safely bound. 
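The level assignment just described is a simple greedy procedure. The following sketch makes two assumptions that should be read as such: the order of the instantiated type is fixed in advance (`ord_tau`, defaulting to 1, i.e. the order of the level-0 Boolean type), and the order of a hierarchy type is computed from ord(A → B) = max(1 + ord A, ord B). The function names are hypothetical.

```python
def order_delta(n, k, ord_tau=1):
    """Order of the level-n type of rank k, assuming ord(tau) = ord_tau."""
    o = n + 1                      # rank 0 is the Boolean type of level n
    for _ in range(k):
        # (D -> tau -> tau) -> tau -> tau has order 2 + max(ord D, ord tau)
        o = 2 + max(o, ord_tau)
    return o

def assign_levels(ks, ord_tau=1):
    """ks[i] is the rank of the i-th variable, innermost binder first.

    The innermost variable gets level 0; each following variable gets the
    smallest level whose type order strictly exceeds that of the previous one.
    """
    levels = [0]
    for i in range(1, len(ks)):
        prev = order_delta(levels[-1], ks[i - 1], ord_tau)
        level = 0
        while order_delta(level, ks[i], ord_tau) <= prev:
            level += 1
        levels.append(level)
    return levels

# Three base-level variables, as in a formula with prefix "forall x exists y exists z".
print(assign_levels([0, 0, 0]))
```

For three base-level variables this gives levels 0, 1 and 2 from the innermost binder outwards.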

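Example 3.1. The formula $\forall x^0.\exists y^0.x^0 \vee y^0$, which is encoded by an unsafe term in Mairson's
encoding, is represented in our encoding by the safe term

\[ \vdash_s C_0^{1\mapsto 0}\big(\mathbf{D}^1_0\,(\lambda x^{\Delta^1_0}.AND^0(\mathbf{D}^0_0\,(\lambda y^{\Delta^0_0}.OR^0(OR^0\,(C_0^{1\mapsto 0}\,x)\,y))\,F^0))\,T^1\big) : B_0. \]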






3.2.6. Set-membership. To complete the interpretation of prime formulae, we need to show 
how to encode set membership. Unfortunately, the introduction of multiple domains of rep- 
resentation does not permit us to completely eliminate the unsafety of Mairson's encoding 
of set membership. 

Indeed, adapting Mairson's encoding of set membership requires the ability to perform
conversion of domains of representation for higher-order sets (not only for Boolean values).
The conversion term $C_0^{n\mapsto n'}$ can be generalized to higher-order sets as follows:

\[ C_{k+1}^{n\mapsto n'} = \lambda m^{\Delta^n_{k+1}} u^{\Delta^{n'}_k \to \tau \to \tau} v^{\tau}.m\,(\lambda z^{\Delta^n_k} w^{\tau}.u\,(C_k^{n\mapsto n'}\,z)\,w)\,v : \Delta^n_{k+1} \to \Delta^{n'}_{k+1} \]

where $k \geq 0$. Unfortunately this term is safe if and only if $n = n'$ (the largest underlined
subterm is safe just when $n \geq n'$ and the other underlined subterm is safe just when $n' \geq n$).
Hence at higher orders, all the non-trivial conversion terms are unsafe.

If the terms $C_{k+1}^{n\mapsto n'}$, $k \geq 0$, $n \neq n'$ were safely representable then the encoding would
go as follows: We set $\tau = B_0$ in the types $\Delta^n_{k+1}$ for all $n, k \geq 0$ in order to iterate a level-0
Boolean function. Firstly, the formulae "$true \in y^1$" and "$false \in y^1$" can be encoded
by the safe terms $y^1(\lambda x^0.OR^0\,x^0)\,F^0$ and $y^1(\lambda x^0.OR^0(NOT^0\,x^0))\,F^0$ respectively. For the
general case "$x^k \in y^{k+1}$" we proceed as in Mairson's proof [23]: we introduce lambda-terms
encoding set equality, set membership and subset tests, and we further parameterize these
encodings by a natural number $n$.

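\[ member^{n+1}_{k+1} = \lambda x^{\Delta^{n+1}_k} y^{\Delta^{n+1}_{k+1}}.(C_{k+1}^{n+1\mapsto n}\,y)\,(\lambda z^{\Delta^n_k}.OR^0\,(eq^n_k\,(C_k^{n+1\mapsto n}\,x)\,z))\,F^0 \]
\[ \qquad : \Delta^{n+1}_k \to \Delta^{n+1}_{k+1} \to B_0 \]
\[ subset^n_{k+1} = \lambda x^{\Delta^n_{k+1}} y^{\Delta^n_{k+1}}.x\,(\lambda x^{\Delta^n_k}.AND^0(member^n_{k+1}\,x\,y))\,T^0 \]
\[ \qquad : \Delta^n_{k+1} \to \Delta^n_{k+1} \to B_0 \]
\[ eq^n_0 = \lambda x^{B_n}.\lambda y^{B_n}.C_0^{n\mapsto 0}\,(OR^n\,(AND^n\,x\,y)\,(AND^n\,(NOT^n\,x)\,(NOT^n\,y))) \]
\[ \qquad : B_n \to B_n \to B_0 \]
\[ eq^n_{k+1} = \lambda x^{\Delta^n_{k+1}} y^{\Delta^n_{k+1}}.(\lambda op^{\Delta^n_{k+1} \to \Delta^n_{k+1} \to B_0}.AND^0\,(op\,x\,y)\,(op\,y\,x))\,subset^n_{k+1} \]
\[ \qquad : \Delta^n_{k+1} \to \Delta^n_{k+1} \to B_0. \]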

The variables in the definitions of $eq^n_{k+1}$ and $subset^n_{k+1}$ are safely bound. Moreover, the
occurrence of $x$ in $member^{n+1}_{k+1}$ is now safely bound (which was not the case in Mairson's
original encoding) thanks to the fact that the representation domain of $z$ is lower than that
of $x$. The formula $x^k \in y^{k+1}$ can then be encoded as

\[ x : \Delta^n_k,\ y : \Delta^{n'}_{k+1} \vdash_{st} member^u_{k+1}\,(C_k^{n\mapsto u}\,x)\,(C_{k+1}^{n'\mapsto u}\,y) : B_0 \]

for some $n, n' \geq 2$ and $u = \min(n, n') + 1$.

Unfortunately this encoding is not completely safe because, as mentioned before, the
conversion term $C_k^{n\mapsto u}$ is unsafe for $k \geq 1$, $n \neq u$. We conjecture that the set-membership
function is intrinsically unsafe.



3.3. PSPACE-hardness. We observe that instances of the True Quantified Boolean Formulae
satisfaction problem (TQBF) are special instances of the decision problem for finite
type theory. These instances correspond to formulae in which set membership is not allowed
and variables are all taken from the base domain $\mathcal{D}_0$. As we have shown in the previous
section, such restricted formulae can be safely encoded in the safe lambda calculus. Therefore,
since TQBF is PSPACE-complete, we have:

Theorem 3.2. Deciding $\beta\eta$-equality of two safe lambda-terms is PSPACE-hard. □

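Example 3.3. Using the encoding where $\tau$ is set to $B_0$ in the types $\Delta^n_k$ for all $k, n \geq 0$,
the formula $\forall x \exists y \exists z\,(x \vee y \vee z) \wedge (\neg x \vee \neg y \vee \neg z)$ is represented by the safe term:

\[
\begin{array}{l}
\vdash_s \mathbf{D}^2_0\,(\lambda x^{B_2}.AND^0 \\
\quad (\mathbf{D}^1_0\,(\lambda y^{B_1}.OR^0 \\
\qquad (\mathbf{D}^0_0\,(\lambda z^{B_0}.OR^0 \\
\qquad\quad (AND^0\,(OR^0\,(OR^0\,(C_0^{2\mapsto 0}\,x)\,(C_0^{1\mapsto 0}\,y))\,z) \\
\qquad\qquad (OR^0\,(OR^0\,(NOT^0\,(C_0^{2\mapsto 0}\,x))\,(NOT^0\,(C_0^{1\mapsto 0}\,y)))\,(NOT^0\,z))) \\
\qquad )\,F^0) \\
\quad )\,F^0) \\
)\,T^0 \\
: B_0.
\end{array}
\]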
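For comparison with the lambda-term above, the truth value of such a quantified formula can be computed by a direct recursive evaluator. This is a plain sketch of TQBF evaluation, not the paper's encoding; the representation of the prefix as a list of `(quantifier, variable)` pairs and the name `eval_qbf` are choices made here.

```python
def eval_qbf(prefix, matrix, env=None):
    """Evaluate a closed quantified Boolean formula.

    prefix: list of ("forall" | "exists", variable_name) pairs, outermost first.
    matrix: function from an assignment (dict) to bool.
    """
    env = env or {}
    if not prefix:
        return matrix(env)
    (quantifier, var), rest = prefix[0], prefix[1:]
    # Try both truth values for the outermost variable.
    branches = (eval_qbf(rest, matrix, {**env, var: b}) for b in (True, False))
    return all(branches) if quantifier == "forall" else any(branches)

# forall x exists y exists z. (x or y or z) and (not x or not y or not z)
prefix = [("forall", "x"), ("exists", "y"), ("exists", "z")]
matrix = lambda e: (e["x"] or e["y"] or e["z"]) and (
    not e["x"] or not e["y"] or not e["z"]
)
print(eval_qbf(prefix, matrix))
```

The recursion depth equals the number of quantified variables, so the evaluator uses space linear in the size of the prefix even though it may take exponential time.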

Remark 3.4. The Boolean satisfaction problem (SAT) is just a particular instance of TQBF
where formulae are restricted to use only existential quantifiers, thus $\beta\eta$-equality in the safe
lambda calculus is also NP-hard. Asperti gave an interpretation of SAT in the simply-typed lambda
calculus but his encoding relies on unsafe terms [6].

Remark 3.5. (i) Because the safety condition restricts expressivity in a non-trivial way,
one can reasonably expect the beta-eta equivalence problem to have a lower complexity
in the safe case than in the normal case; this intuition is strengthened by our failed
attempt to encode type theory in the safe lambda calculus. No upper bound is known
at present. On the other hand our PSPACE-hardness result is probably a coarse lower
bound; it would be interesting to know whether we also have EXPTIME-hardness.

(ii) Statman showed [36] that when restricted to some finite set of types, the beta-eta
equivalence problem is PSPACE-hard. Such a result is unlikely to hold in the safe
lambda calculus. This is suggested by the fact that we had to use the entire type
hierarchy to encode TQBF in the safe lambda calculus. In fact we expect the beta-eta
equivalence problem for safe terms to have a complexity lower than PSPACE when
restricted to any finite set of types.

(iii) The normalization problem ("Given a (safe) term $M$, what is its $\beta$-normal form?")
is non-elementary. Indeed, let $\tau_{-2} = o$ and for $n \geq -1$, $\tau_n = \tau_{n-1} \to \tau_{n-1}$. For
$k, n \in \mathbb{N}$, let $\overline{k}^n$ denote the $k$th Church numeral $\lambda s^{\tau_{n-1}} z^{\tau_{n-2}}.s(\cdots(s(s\,z))\cdots)$ (with $k$
applications of $s$) of type $\tau_n$. Then for $n \geq 1$, the safe term $\overline{2}^{n-1}\,\overline{2}^{n-2} \cdots \overline{2}^0$ of type $\tau_0$
has size $O(n)$ and its normal form $\exp_n(1)$ has size $O(\exp_n(1))$.

Thus in the simply-typed lambda calculus, beta-eta equivalence is essentially as
hard as normalization. We do not know if this is the case in the safe lambda calculus.

(iv) A related problem is that of beta-reduction: "Given a $\beta$-normal term $M_1$ and a term
$M_2$, does $M_2$ $\beta$-reduce to $M_1$?". It is known to be PSPACE-complete when restricted
to order-3 terms [33], but no complexity result is known for higher orders. The safe case
can potentially give rise to interesting complexity characterizations at higher orders.
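The blow-up in item (iii) can be observed with untyped Church numerals. The sketch below is only an illustration under the assumption that typing can be ignored: it applies the numeral 2 to itself repeatedly, associating to the left, and reads the result back as an integer. The names `church`, `to_int` and `tower` are introduced here.

```python
def church(k):
    """The k-th Church numeral: s is applied k times to z."""
    def numeral(s):
        def run(z):
            for _ in range(k):
                z = s(z)
            return z
        return run
    return numeral

def to_int(c):
    """Read a Church numeral back by counting applications of successor."""
    return c(lambda m: m + 1)(0)

two = church(2)

def tower(n):
    """The application 2 2 ... 2 with n copies of 2, associated to the left."""
    t = two
    for _ in range(n - 1):
        t = t(two)
    return t

print([to_int(tower(n)) for n in range(1, 5)])
```

A term of size linear in n thus normalizes to a numeral whose value is a tower of exponentials of height n.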

4. A GAME-SEMANTIC ACCOUNT OF SAFETY

Our aim is to characterize safety by game semantics. We shall assume that the reader
is familiar with the basics of game semantics; for an introduction, we recommend Abramsky
and McCusker's tutorial [3]. Recall that a justified sequence over an arena is an alternating
sequence of O-moves and P-moves such that every move $m$, except the opening move, has a
pointer to some earlier occurrence of a move $m_0$ such that $m_0$ enables $m$ in the arena. A
play is just a justified sequence that satisfies Visibility and Well-Bracketing. A basic result
in game semantics is that $\lambda$-terms are denoted by innocent strategies, which are strategies
that depend only on the P-view of a play. The main result (Theorem 4.11) of this section is
that if a $\lambda$-term is safe, then its game semantics (is an innocent strategy that) is, what we
call, P-incrementally justified. In such a strategy, pointers emanating from the P-moves of a
play are uniquely reconstructible from the underlying sequence of moves and pointers from
the O-moves therein: Specifically a P-question always points to the last pending O-question
(in the P-view) of a greater order.

The proof of Theorem 4.11 depends on a Correspondence Theorem (see the Appendix)
that relates the strategy denotation of a $\lambda$-term $M$ to the set of traversals over a souped-up
abstract syntax tree of the $\eta$-long form of $M$. In the language of game semantics, traversals
are just (concrete representations of) the uncovering (in the sense of Hyland and Ong [18])
of plays in the strategy denotation.

The useful transference technique between plays and traversals was originally introduced
by the second author [30] for studying the decidability of monadic second-order theories
of infinite structures generated by higher-order grammars (in which the $\Sigma$-constants or
terminal symbols are at most order 1, and uninterpreted). In the Appendix, we present an
extension of this framework to the general case of the simply-typed lambda calculus with 
free variables of any order. A new traversal rule is introduced to handle nodes labelled 
with free variables. Also new nodes are added to the computation tree to account for the 
answer moves of the game semantics, thus enabling the framework to model languages with 
interpreted constants such as PCF (by adding traversal rules to handle constant nodes). 

Incrementally-bound computation tree. In the context of higher-order grammars, the
computation tree is defined as the unravelling of the finite graph representing the long
transform of the grammar [30]. Similarly we define the computation tree of a $\lambda$-term as an
abstract syntax tree of its $\eta$-long normal form. We write $l\langle t_1, \ldots, t_n\rangle$ with $n \geq 0$ to denote
the ordered tree with a root labelled $l$ with $n$ child-subtrees $t_1, \ldots, t_n$. In the following we
consider arbitrary simply-typed terms.

Definition 4.1. The computation tree $\tau(M)$ of a simply-typed term $\Gamma \vdash_{st} M : T$ with
variable names in a countable set $\mathcal{V}$ is a tree with labels in

\[ \{@\} \cup \mathcal{V} \cup \{\lambda x_1 \ldots x_n \mid x_1, \ldots, x_n \in \mathcal{V},\ n \in \mathbb{N}\}, \]

defined from its $\eta$-long form as follows. Suppose $\overline{x} = x_1 \ldots x_n$ for $n \geq 0$; then

\[
\begin{array}{ll}
\text{for } m \geq 0,\ z \in \mathcal{V}: & \tau(\lambda\overline{x}.z\,s_1 \ldots s_m : o) = \lambda\overline{x}\langle z\langle \tau(s_1), \ldots, \tau(s_m)\rangle\rangle \\
\text{for } m \geq 1: & \tau(\lambda\overline{x}.(\lambda y.t)\,s_1 \ldots s_m : o) = \lambda\overline{x}\langle @\langle \tau(\lambda y.t), \tau(s_1), \ldots, \tau(s_m)\rangle\rangle.
\end{array}
\]
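The two clauses of the definition translate directly into a recursive function. The sketch below assumes that the input term is already in eta-long form and ignores types; the classes `Var` and `Lam` and the nested-tuple representation of trees are conventions invented for this illustration.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Var:
    name: str

@dataclass
class Lam:
    """An eta-long term  \\x1 ... xn . head s1 ... sm  (n = 0 gives a dummy lambda)."""
    params: List[str]
    head: Union[Var, "Lam"]
    args: List["Lam"]

def tree(t):
    """Computation tree as nested (label, children) pairs."""
    label = "λ" + " ".join(t.params)
    children = [tree(s) for s in t.args]
    if isinstance(t.head, Var):
        # First clause: the head is a variable z.
        return (label, [(t.head.name, children)])
    # Second clause: the head is an abstraction, so an @-node is introduced.
    return (label, [("@", [tree(t.head)] + children)])

# \f z. (\u v. u (\.v)) (\y. f (\.y)) (\.z)
term = Lam(
    ["f", "z"],
    Lam(["u", "v"], Var("u"), [Lam([], Var("v"), [])]),
    [Lam(["y"], Var("f"), [Lam([], Var("y"), [])]), Lam([], Var("z"), [])],
)
print(tree(term))
```

Consecutive abstractions are merged into one node by construction, since `params` holds the whole block of bound variables.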
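Example 4.2. Take $\vdash_{st} \lambda f^{o\to o}.(\lambda u^{o\to o}.u)\,f : (o \to o) \to o \to o$.
Its $\eta$-long normal form is

\[ \vdash_{st} \lambda f^{o\to o} z^o.(\lambda u^{o\to o} v^o.u(\lambda.v))\,(\lambda y^o.f\,y)\,(\lambda.z) : (o \to o) \to o \to o. \]

Its computation tree is $\lambda f z\langle @\langle \lambda u v\langle u\langle \lambda\langle v\rangle\rangle\rangle,\ \lambda y\langle f\langle \lambda\langle y\rangle\rangle\rangle,\ \lambda\langle z\rangle\rangle\rangle$.

Example 4.3. Take $\vdash_{st} \lambda u^o v^{((o\to o)\to o)}.(\lambda x^o.v(\lambda z^o.x))\,u : o \to ((o \to o) \to o) \to o$.
Its $\eta$-long normal form is

\[ \vdash_{st} \lambda u^o v^{((o\to o)\to o)}.(\lambda x^o.v(\lambda z^o.x))\,u : o \to ((o \to o) \to o) \to o. \]

Its computation tree is $\lambda u v\langle @\langle \lambda x\langle v\langle \lambda z\langle x\rangle\rangle\rangle,\ \lambda\langle u\rangle\rangle\rangle$.
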

Even-level nodes are $\lambda$-nodes (the root is on level 0). A single $\lambda$-node can represent several
consecutive variable abstractions or it can just be a dummy lambda if the corresponding
subterm is of ground type. Odd-level nodes are variable or application nodes.

The order of a node $n$, written $\mathrm{ord}\,n$, is defined as follows: @-nodes have order 0. The
order of a variable-node is the type-order of the variable labelling it. The order of the root
node is the type-order of $(A_1, \ldots, A_p, T)$ where $A_1, \ldots, A_p$ are the types of the variables in
the context $\Gamma$. Finally, the order of a lambda node different from the root is the type-order
of the term represented by the sub-tree rooted at that node.

We say that a variable node $n$ labelled $x$ is bound by a node $m$, and $m$ is called the
binder of $n$, if $m$ is the closest node in the path from $n$ to the root such that $m$ is labelled
$\lambda\overline{\xi}$ with $x \in \overline{\xi}$.

We introduce a class of computation trees in which the binder node is uniquely determined
by the nodes' orders:

Definition 4.4. A computation tree is incrementally-bound if for every variable node $x$,
either $x$ is bound by the first $\lambda$-node in the path to the root with order $> \mathrm{ord}\,x$, or $x$ is a
free variable and all the $\lambda$-nodes in the path to the root except the root have order $\leq \mathrm{ord}\,x$.
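Incremental binding is a purely local check on each variable node and its path to the root. The following sketch assumes a tree whose nodes already carry their orders (computing orders from types is left out), reads the condition on free variables as "order at most ord x", and uses a `Node` class and field names that are invented for the illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                 # "lam", "var" or "app"
    order: int
    names: tuple = ()         # bound names for "lam"; (name,) for "var"
    children: list = field(default_factory=list)

def incrementally_bound(root):
    """Check the incremental-binding condition on every variable node."""
    def visit(node, path):    # path: lambda-nodes from the root down to here
        ok = True
        if node.kind == "var":
            x = node.names[0]
            # Binder: nearest enclosing lambda-node abstracting x, if any.
            binder = next((l for l in reversed(path) if x in l.names), None)
            # First lambda-node on the way up whose order exceeds ord x.
            first_higher = next(
                (l for l in reversed(path) if l.order > node.order), None
            )
            if binder is not None:
                ok = first_higher is binder
            else:
                # Free variable: every non-root lambda-node on the path
                # must have order at most ord x.
                ok = all(l.order <= node.order for l in path[1:])
        new_path = path + [node] if node.kind == "lam" else path
        return ok and all(visit(c, new_path) for c in node.children)
    return visit(root, [])

# Tree of f (\y. x) with f : ((o,o),o) and x : o free; orders 3, 2, 1, 0.
x_leaf = Node("var", 0, ("x",))
unsafe_tree = Node("lam", 3, (), [
    Node("var", 2, ("f",), [Node("lam", 1, ("y",), [x_leaf])])
])
print(incrementally_bound(unsafe_tree))
```

In the example tree the free variable of order 0 sits below a non-root lambda-node of order 1, so the check fails.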

Proposition 4.5 (Safety and incremental-binding).

(i) If $M$ is safe then $\tau(M)$ is incrementally-bound.

(ii) Conversely, if $M$ is a closed simply-typed term and $\tau(M)$ is incrementally-bound then
$M$ is safe.

Proof. (i) Suppose that $M$ is safe. By Lemma 1.27 the $\eta$-long form of $M$ is safe, therefore
$\tau(M)$ is the tree representation of a safe term.

In the safe lambda calculus, the variables in the context with the lowest order must be
all abstracted at once when using the abstraction rule. Since the computation tree merges
consecutive abstractions into a single node, any variable $x$ occurring free in the subtree
rooted at a node $\lambda\overline{\xi}$ different from the root must have order greater than or equal to $\mathrm{ord}\,\lambda\overline{\xi}$.
Conversely, if a lambda node $\lambda\overline{\xi}$ binds a variable node $x$ then $\mathrm{ord}\,\lambda\overline{\xi} = 1 + \max_{z \in \overline{\xi}} \mathrm{ord}\,z > \mathrm{ord}\,x$.

Let $x$ be a bound variable node. Its binder occurs in the path from $x$ to the root,
therefore, according to the previous observation, $x$ must be bound by the first $\lambda$-node
occurring in this path with order $> \mathrm{ord}\,x$. Let $x$ be a free variable node; then $x$ is not
bound by any of the $\lambda$-nodes occurring in the path to the root. Once again, by the previous
observation, all these $\lambda$-nodes except the root have order at most $\mathrm{ord}\,x$. Hence $\tau(M)$ is
incrementally-bound.

(ii) Let $M$ be a closed term such that $\tau(M)$ is incrementally-bound. W.l.o.g. we can
assume that $M$ is in $\eta$-long form. We prove that $M$ is safe by induction on its structure.
The base case $M = \lambda\overline{\xi}.x$ for some variable $x$ is trivial. Step case: $M = \lambda\overline{\xi}.N_1 \ldots N_p$. Let
$i$ range over $1..p$. We have $N_i = \lambda\overline{\eta_i}.N_i'$ for some non-abstraction term $N_i'$. By the induction
hypothesis, $\lambda\overline{\xi}.N_i = \lambda\overline{\xi}\,\overline{\eta_i}.N_i'$ is a safe closed term, and consequently $N_i'$ is necessarily safe.
Let $z$ be a free variable of $N_i'$ not bound by $\lambda\overline{\eta_i}$ in $N_i$. Since $\tau(M)$ is incrementally-bound
we have $\mathrm{ord}\,z \geq \mathrm{ord}\,\lambda\overline{\eta_i} = \mathrm{ord}\,N_i$, thus we can abstract the variables $\overline{\eta_i}$ using (abs), which
shows that $N_i$ is safe. Finally we conclude $\vdash_s M = \lambda\overline{\xi}.N_1 \ldots N_p : T$ using the rules (app)
and (abs). □

The assumption that $M$ is closed is necessary. For instance for $x, y : o$, the computation
trees $\tau(\lambda x y.x)$ and $\tau(\lambda y.x)$ are both incrementally-bound but $\lambda x y.x$ is safe and $\lambda y.x$ is not.

P-incrementally justified strategy. We now consider the game-semantic model of the
simply-typed lambda calculus. The strategy denotation of a term-in-context $\Gamma \vdash_{st} M : T$ is
written $[\![\Gamma \vdash_{st} M : T]\!]$. We define the order of a move $m$, written $\mathrm{ord}\,m$, to be the length of
the path from $m$ to its furthest leaf in the arena minus 1. (There are several ways to define
the order of a move; the definition chosen here is sound in the current setting where each
question move in the arena enables at least one answer move.)

Definition 4.6. A strategy $\sigma$ is said to be P-incrementally justified if for every play
$s\,q \in \sigma$ where $q$ is a P-question, $q$ points to the last unanswered O-question in $\ulcorner s \urcorner$ with
order strictly greater than $\mathrm{ord}\,q$.

Note that although the pointer is determined by the P-view, the choice of the move 
itself can be based on the whole history of the play. Thus P-incremental justification does 
not imply innocence. 

The definition suggests an algorithm that, given a play of a P-incrementally justified 
denotation, uniquely recovers the pointers from the underlying sequence of moves and from 
the pointers associated to the O-moves therein. Hence: 

Lemma 4.7. In P-incrementally justified strategies, pointers emanating from P-moves are 
superfluous. 
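The recovery algorithm alluded to above amounts to a backward scan of the P-view. The sketch below assumes that the P-view is given as a list of records carrying the player, the kind of move, its order and whether it has already been answered; this record format and the function name `recover_pointer` are assumptions made for the illustration.

```python
def recover_pointer(p_view, q_order):
    """Index in p_view of the justifier of a P-question of order q_order.

    The justifier is the last unanswered O-question whose order is strictly
    greater than q_order; None is returned if there is no such move.
    """
    for i in range(len(p_view) - 1, -1, -1):
        m = p_view[i]
        if (
            m["player"] == "O"
            and m["kind"] == "question"
            and not m["answered"]
            and m["order"] > q_order
        ):
            return i
    return None

view = [
    {"player": "O", "kind": "question", "order": 3, "answered": False},
    {"player": "P", "kind": "question", "order": 2, "answered": False},
    {"player": "O", "kind": "question", "order": 1, "answered": False},
]
print(recover_pointer(view, 0), recover_pointer(view, 1))
```

Only the pointers of P-questions are recovered this way; the pointers of O-moves still have to be supplied with the play.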

Example 4.8. Copycat strategies, such as the identity strategy $id_A$ on game $A$ or the
evaluation map $ev_{A,B}$ of type $(A \Rightarrow B) \times A \to B$, are all P-incrementally justified.

(In such strategies, a P-move $m$ is justified as follows: Either $m$ points to the preceding move in the
P-view or the preceding move is of smaller order and $m$ is justified by the second last O-move in the P-view.)

The Correspondence Theorem 6.10 gives us the following equivalence:

Proposition 4.9. Let $\Gamma \vdash_{st} M : T$ be a $\beta$-normal term. The computation tree $\tau(M)$ is
incrementally-bound if and only if $[\![\Gamma \vdash_{st} M : T]\!]$ is P-incrementally justified.



Example 4.10. Consider the $\beta$-normal term $\Gamma \vdash_{st} f(\lambda y.x) : o$ where $y : o$ and $\Gamma =
f : ((o, o), o), x : o$. Its computation tree, with the node orders given as superscripts, is
$\lambda^3\langle f^2\langle \lambda y^1\langle x^0\rangle\rangle\rangle$. The node $x$ is not incrementally-bound, therefore
$\tau(f(\lambda y.x))$ is not incrementally-bound and by Proposition 4.9 $[\![\Gamma \vdash_{st} f(\lambda y.x) : o]\!]$
is not P-incrementally justified (although $[\![\Gamma \vdash_{st} f : ((o, o), o)]\!]$ and $[\![\Gamma \vdash_{st} \lambda y.x : (o, o)]\!]$
are).


Propositions 4.5 and 4.9 allow us to show the following:

Theorem 4.11 (Safety and P-incremental justification).

(i) If $\Gamma \vdash_s M : T$ then $[\![\Gamma \vdash_s M : T]\!]$ is P-incrementally justified.

(ii) If $\vdash_{st} M : T$ is a closed simply-typed term and $[\![\vdash_{st} M : T]\!]$ is P-incrementally justified
then the $\beta$-normal form of $M$ is safe.

Proof. (i) Let $M$ be a safe simply-typed term. By Lemma 1.18 its $\beta$-normal form $M'$ is
also safe. By Proposition 4.5(i), $\tau(M')$ is incrementally-bound and by Proposition 4.9 $[\![M']\!]$
is P-incrementally justified. Finally the soundness of the game model gives $[\![M]\!] = [\![M']\!]$. (ii)
is a consequence of Lemma 1.18, Propositions 4.9 and 4.5(ii), and soundness of the game
model. □

Putting Theorem 4.11(i) and Lemma 4.7 together gives:

Proposition 4.12. In the game semantics of safe $\lambda$-terms, pointers emanating from P-moves
are unnecessary: they are uniquely recoverable from the underlying sequences of moves
and from O-moves' pointers.

Example 4.13. If justification pointers are omitted then the denotations of the two Kierstead
terms from Example 1.5 are not distinguishable. In the safe lambda calculus this
ambiguity disappears since $M_1$ is safe whereas $M_2$ is not.

In fact, as the last example highlights, pointers are superfluous at order 3 for safe
terms whether from P-moves or O-moves. This is because for question moves in the
first two levels of an arena (initial moves being at level 0), the associated pointers are
uniquely recoverable thanks to the visibility condition. At the third level, the question
moves are all P-moves, therefore their associated pointers are uniquely recoverable by
P-incremental justification. This is not true anymore at order 4: Take the safe term-in-context
$\psi : (((o^4, o^3), o^2), o^1) \vdash_s \psi(\lambda\varphi^{(o^4, o^3)}.\varphi\,a) : o^0$ for some constant $a : o$. Its strategy denotation
contains plays whose underlying sequence of moves is $q_0\,q_1\,q_2\,q_3\,q_2\,q_3\,q_4$. Since $q_4$ is an
O-move, it is not constrained by P-incremental justification and thus it can point to either of
the two occurrences of $q_3$.



More generally, a P-incrementally justified strategy can contain plays that are not "O-incrementally
justified" since it must take into account any possible strategy incarnating its context, including those that
are not P-incrementally justified. For instance in the given example, there is one version of the play that is
not O-incrementally justified (the one where $q_4$ points to the first occurrence of $q_3$). This play is involved in
the strategy composition $[\![\vdash_{st} M_2 : (((o,o),o),o)]\!]\,;\,[\![\psi : (((o,o),o),o) \vdash_{st} \psi(\lambda\varphi.\varphi\,a) : o]\!]$ where $M_2$ denotes the
unsafe Kierstead term.

Towards a fully abstract game model. The standard game models which have been
shown to be fully abstract for PCF [2, 18] are of course also fully abstract for the restricted
language safe PCF. One may ask, however, whether there exists a fully abstract model with
respect to safe contexts only. Such a model may be obtained by considering P-incrementally
justified strategies, which have been shown to compose [7]. It is reasonable to think that
O-moves also need to be constrained by the symmetrical O-incremental justification, which
corresponds to the requirement that contexts are safe. This line of work is still in progress.

Safe PCF and safe Idealised Algol. PCF is the simply-typed lambda calculus augmented
with basic arithmetic operators, if-then-else branching and a family of recursion
combinators $Y_A : ((A, A), A)$ for every type $A$. We define safe PCF to be PCF where the
application and abstraction rules are constrained in the same way as the safe lambda
calculus. This language inherits the good properties of the safe lambda calculus: No variable
capture occurs when performing substitution and safety is preserved by the reduction rules
of the small-step semantics of PCF.

Correspondence. The computation tree of a PCF term is defined as the least upper-bound of
the chain of computation trees of its syntactic approximants [3]. It is obtained by infinitely
expanding the Y combinator, for instance $\tau(Y(\lambda f x.f\,x))$ is the tree representation of the
$\eta$-long form of the infinite term $(\lambda f x.f\,x)((\lambda f x.f\,x)((\lambda f x.f\,x)(\ldots$

It is straightforward to define the traversal rules modeling the arithmetic constants of
PCF. Just as in the safe lambda calculus we had to remove @-nodes in order to reveal the
game-semantic correspondence, in safe PCF it is necessary to filter out the constant nodes
from the traversals. The Correspondence Theorem for PCF says that the revealed game
semantics is isomorphic to the set of traversals disposed of these superfluous nodes. This
can easily be shown for term approximants. It is then lifted to full PCF using the continuity
of the function Trv(_) from the set of computation trees (ordered by the approximation
ordering) to the set of sets of justified sequences of nodes (ordered by subset inclusion).
Finally computation trees of safe PCF terms are incrementally-bound, thus we have:

Theorem 4.14. Safe PCF terms have P-incrementally justified denotations. □

Similarly, we can define safe IA to be safe PCF augmented with the imperative features
of Idealized Algol (IA for short) [32]. Adapting the game-semantic correspondence and
safety characterization to IA seems feasible although the presence of the base type var,
whose game arena $com^{\mathbb{N}} \times exp$ has infinitely many initial moves, causes a mismatch between
the simple tree representation of the term and its game arena. It may be possible to
overcome this problem by replacing the notion of computation tree by a "computation
directed acyclic graph".

The possibility of representing plays without some or all of their pointers under the
safety assumption suggests potential applications in algorithmic game semantics. Ghica
and McCusker [15] were the first to observe that pointers are unnecessary for representing
plays in the game semantics of the second-order finitary fragment of Idealized Algol (IA2
for short). Consequently observational equivalence for this fragment can be reduced to the
problem of equivalence of regular expressions. At order 3, although pointers are necessary,
deciding observational equivalence of IA3 is EXPTIME-complete [29, 28]. Restricting the
problem to the safe fragment of IA3 may lead to a lower complexity.

5. Further work and open problems

The safe lambda calculus is still not well understood. Many basic questions remain. 
What is a (categorical) model of the safe lambda calculus? Does the calculus have inter- 
esting models? What kind of reasoning principles does the safe lambda calculus support, 
via the Curry-Howard Isomorphism? Does the safe lambda calculus characterize a com- 
plexity class, in the same way that the simply-typed lambda calculus characterizes the 
polytime-computable numeric functions [21]? Is the addition of unsafe contexts to safe ones 
conservative with respect to observational (or contextual) equivalence? 

With a view to algorithmic game semantics and its applications, it would be interesting
to identify sublanguages of Idealised Algol whose game semantics enjoy the property
that pointers in a play are uniquely recoverable from the underlying sequence of moves.
We name this class PUR. IA2 is the paradigmatic example of a PUR-language. Another
example is Serially Re-entrant Idealized Algol [1], a version of IA where multiple uses of
arguments are allowed only if they do not "overlap in time". We believe that a PUR language
can be obtained by imposing the safety condition on IA3. Murawski [27] has shown
that observational equivalence for IA4 is undecidable; is observational equivalence for safe
IA4 decidable?

6. Appendix - Computation tree, traversals and correspondence

The second author introduced the notion of computation tree and traversals over a
computation tree for the purpose of studying trees generated by higher-order recursion
schemes [30]. Here we extend these concepts to the simply-typed lambda calculus. Our
setting allows the presence of free variables of any order and the term studied is not required
to be of ground type. (This contrasts with [30]'s setting where the term is of ground type
and contains only uninterpreted constants.) Note that we automatically account for the
presence of uninterpreted constants since they can just be regarded as free variables. We
will then state the Correspondence Theorem (Theorem 6.10) that was used in Sec. 4.

In the following we fix a simply-typed term-in-context $\Gamma \vdash_{st} M : T$ (not necessarily
safe) and we consider its computation tree $\tau(M)$ as defined in Def. 4.1.

6.1. Notations. We first fix some notations. We write $\circledast$ to denote the root of the
computation tree $\tau(M)$. The set of nodes of this computation tree is denoted by $IN$. The sets
$IN_@$, $IN_\lambda$ and $IN_{var}$ are respectively the subsets of @-nodes, $\lambda$-nodes and variable nodes.
The type of a variable-labelled node is the type of the variable that labels it; the type of
the root is $(A_1, \ldots, A_p, T)$ where $x_1 : A_1, \ldots, x_p : A_p$ are the variables in the context $\Gamma$; and
the type of a node $n \in (IN_\lambda \cup IN_@) \setminus \{\circledast\}$ is the type of the subterm of $\lceil M \rceil$ corresponding
to the subtree of $\tau(M)$ rooted at $n$.

6.2. Pointers and justified sequences of nodes. We define the enabling relation on
the set of nodes of the computation tree as follows: $m$ enables $n$, written $m \vdash n$, if and only
if $n$ is bound by $m$ (and we sometimes write $m \vdash_i n$ to indicate that $n$ is the $i$th variable
bound by $m$); or $m$ is the root $\circledast$ and $n$ is a free variable; or $n$ is a $\lambda$-node and $m$ is its
parent node.

We say that a node $n_0$ of the computation tree is hereditarily enabled by $n_p \in IN$ if
there are nodes $n_1, \ldots, n_{p-1} \in IN$ such that $n_{i+1}$ enables $n_i$ for all $i \in 0..p-1$.

For any sets of nodes $S, H \subseteq N$ we write $S^{H\vdash}$ for $\{n \in S \mid \exists m \in H \text{ s.t. } m \vdash^* n\}$, the
subset of $S$ consisting of nodes hereditarily enabled by some node in $H$. We will abbreviate
$S^{\{m\}\vdash}$ into $S^{m\vdash}$.

We call input-variables nodes the elements of IN®\^ (i.e., variables that are heredi- 
tarily enabled by the root of r(M)). Thus we have IN*£ = IN\ (IN^ @h U IN^ h ). 

A justified sequence of nodes is a sequence of nodes with pointers such that each
occurrence of a variable or $\lambda$-node $n$ different from the root has a pointer to some preceding
occurrence $m$ satisfying $m \vdash n$. In particular, occurrences of @-nodes do not have pointers.
We represent the pointer in the sequence as an arrow from $n$ back to $m$ carrying a label $i$, where the label indicates that
either $n$ is labelled with the $i$th variable abstracted by the $\lambda$-node $m$ or that $n$ is the $i$th
child of $m$. Children nodes are numbered from 1 onward except for @-nodes where it starts
from 0. Abstracted variables are numbered from 1 onward. The $i$th child of $n$ is denoted by
$n.i$.

We say that a node $n_0$ of a justified sequence is hereditarily justified by $n_p$ if there
are occurrences $n_1, \ldots, n_{p-1}$ in the sequence such that $n_i$ points to $n_{i+1}$ for all $i \in 0..p-1$.
For any occurrence $n$ in a justified sequence $s$, we write $s \upharpoonright n$ to denote the subsequence of
$s$ consisting of occurrences that are hereditarily justified by $n$.

The notion of P-view $\ulcorner t \urcorner$ of a justified sequence of nodes $t$ is defined the same way as
the P-view of a justified sequence of moves in Game Semantics:

$$\ulcorner \epsilon \urcorner = \epsilon$$
$$\ulcorner s \cdot m \cdot \ldots \cdot \lambda\overline{\xi} \urcorner = \ulcorner s \urcorner \cdot m \cdot \lambda\overline{\xi} \quad \text{(where } \lambda\overline{\xi} \text{ points to } m\text{)}$$
$$\ulcorner s \cdot n \urcorner = \ulcorner s \urcorner \cdot n \quad \text{for } n \notin \mathbb{N}_\lambda$$
$$\ulcorner s \cdot \circledast \urcorner = \circledast$$

The O-view of $s$, written $\llcorner s \lrcorner$, is defined dually. We will borrow the game-semantic
terminology: A justified sequence of nodes satisfies alternation if for any two consecutive
nodes one is a $\lambda$-node and the other is not, and P-visibility if every variable node points
to a node occurring in the P-view at that point.
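
The P-view equations can be read as a backward scan over the sequence. The following Python sketch is one possible rendering of that reading; the tuple encoding `(label, kind, pointer)` and the function names are assumptions of this illustration. The sample sequence is the traversal of Example 6.11 for the term $\lambda f z.(\lambda g x.f\,x)(\lambda y.y)(f\,z)$, with pointer indices reconstructed by hand from the traversal rules, so it should be taken as illustrative data rather than as the paper's own figure.

```python
LAM, VAR, APP = "lam", "var", "app"

# Each occurrence is (label, kind, pointer); the pointer is the index of the
# justifying occurrence, or None. Pointers here are hand-reconstructed.
T = [
    ("\\fz", LAM, None),  # 0: root
    ("@",    APP, None),  # 1: application node, no pointer
    ("\\gx", LAM, 1),     # 2: 0th child of @
    ("f",    VAR, 0),     # 3: bound by the root
    ("\\",   LAM, 3),     # 4: child of f
    ("x",    VAR, 2),     # 5: bound by \gx
    ("\\",   LAM, 1),     # 6: 2nd child of @
    ("f",    VAR, 0),     # 7: bound by the root
    ("\\",   LAM, 7),     # 8: child of f
    ("z",    VAR, 0),     # 9: bound by the root
]


def p_view(seq):
    """Return the indices (in order) of the occurrences in the P-view of seq."""
    view = []
    i = len(seq) - 1
    while i >= 0:
        _, kind, ptr = seq[i]
        view.append(i)
        if kind == LAM:
            if ptr is None:      # the root: the P-view starts here
                break
            view.append(ptr)     # keep the justifier m ...
            i = ptr - 1          # ... and continue with the prefix before m
        else:
            i -= 1               # non-lambda node: continue with the prefix
    view.reverse()
    return view


def satisfies_p_visibility(seq):
    """Every variable occurrence points into the P-view of the prefix before it."""
    return all(ptr in p_view(seq[:i])
               for i, (_, kind, ptr) in enumerate(seq) if kind == VAR)


print([T[i][0] for i in p_view(T)])
```

On this sample the P-view keeps the root, the @-node, the second operand's $\lambda$-node, and then `f`, its child and `z`, which is the branch of the computation tree leading to `z`.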

6.3. Computation tree with value-leaves. We now add another ingredient to the
computation tree that was not originally used in the context of higher-order grammars [30]. We
write $\mathcal{D}$ to denote the set of values of the base type $o$. We add value-leaves to $\tau(M)$ as
follows: For each value $v \in \mathcal{D}$ and for each node $n$ of the computation tree we attach a new
child leaf $v_n$ to $n$. We write $N$ for the set of nodes (i.e., inner nodes and leaf nodes) of the
resulting tree. The set of leaf nodes is denoted $L$; we thus have $N = \mathbb{N} \cup L$. For $\$$ ranging
in $\{@, \lambda, var\}$, we write $N_\$$ to denote the set consisting of nodes from $\mathbb{N}_\$$ together with
leaf nodes with parent node in $\mathbb{N}_\$$; formally $N_\$ = \mathbb{N}_\$ \cup \{ v_n \mid n \in \mathbb{N}_\$, v \in \mathcal{D} \}$.

The basic notions can be adapted to this new version of computation tree: A value-leaf
has order 0. The enabling relation $\vdash$ is extended so that every leaf is enabled by its parent
node. A link going from a value-leaf $v_n$ to a node $n$ is labelled by $v$ (e.g., $n \cdot \ldots \cdot v_n$ with
an arc from $v_n$ to $n$ labelled $v$). For the
definition of P-view and visibility, value-leaves are treated as $\lambda$-nodes if they are at an odd
level in the computation tree, and as variable nodes if they are at an even level.

We say that an occurrence of an inner node $n \in \mathbb{N}$ is answered by an occurrence
$v_n$ if $v_n$ occurs in the sequence and points to $n$; otherwise we say that $n$ is unanswered. The
last unanswered node is called the pending node. A justified sequence of nodes is
well-bracketed if each value-leaf occurring in it is justified by the pending node at that point.
If $t$ is a traversal then we write $?(t)$ to denote the subsequence of $t$ consisting only of
unanswered nodes.
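
As a hedged illustration of these definitions, the Python sketch below computes the unanswered occurrences, the pending node and a well-bracketing check on abstract justified sequences. The encoding `(name, is_leaf, pointer)` and the two sample sequences are hypothetical and serve only to exercise the definitions.

```python
# Each occurrence is (name, is_leaf, pointer); the pointer is an index or None.

def unanswered(seq):
    """Indices of inner-node occurrences that no value-leaf points to: ?(seq)."""
    answered = {ptr for (_, is_leaf, ptr) in seq if is_leaf}
    return [i for i, (_, is_leaf, _) in enumerate(seq)
            if not is_leaf and i not in answered]


def pending(seq):
    """Index of the last unanswered occurrence, or None if all are answered."""
    u = unanswered(seq)
    return u[-1] if u else None


def is_well_bracketed(seq):
    """Each value-leaf must point to the pending node of the prefix before it."""
    return all(pending(seq[:i]) == ptr
               for i, (_, is_leaf, ptr) in enumerate(seq) if is_leaf)


# Hypothetical sequences: a and b are inner nodes, v_a and v_b value-leaves.
GOOD = [("a", False, None), ("b", False, 0), ("v_b", True, 1), ("v_a", True, 0)]
BAD = [("a", False, None), ("b", False, 0), ("v_a", True, 0)]

print(is_well_bracketed(GOOD), is_well_bracketed(BAD))
```

In `BAD` the leaf answers `a` while `b` is still pending, which is exactly the situation that well-bracketing rules out.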

6.4. Traversals of the computation tree. A traversal is a justified sequence of nodes of 
the computation tree where each node indicates a step that is taken during the evaluation 
of the term. 

Definition 6.1 (Traversals for simply-typed $\lambda$-terms). The set $\mathcal{T}rv(M)$ of traversals over
$\tau(M)$ is defined by induction over the rules of Table 1. A traversal that cannot be extended
by any rule is said to be maximal.



The equalities in the definition determine pointers implicitly. For instance in the second clause, if in
the left-hand side, $n$ points to some node in $s$ that is also present in $\ulcorner s \urcorner$ then in the right-hand side, $n$ points
to that occurrence of the node in $\ulcorner s \urcorner$.






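Initialization rules

(Empty) $\epsilon \in \mathcal{T}rv(M)$.

(Root) The sequence consisting of a single occurrence of $\tau(M)$'s root is a traversal.

Structural rules

(Lam) If $t \cdot \lambda\overline{\xi}$ is a traversal then so is $t \cdot \lambda\overline{\xi} \cdot n$ where $n$ denotes $\lambda\overline{\xi}$'s child and:

  - if $n \in \mathbb{N}_@ \cup \mathbb{N}_\Sigma$ then it has no justifier;

  - if $n \in \mathbb{N}_{var} \setminus \mathbb{N}_{var}^{\circledast\vdash}$ then it points to the only occurrence of its binder in $\ulcorner t \cdot \lambda\overline{\xi} \urcorner$;

  - if $n \in \mathbb{N}_{var}^{\circledast\vdash}$ then it points to the only occurrence of the root $\circledast$ in $\ulcorner t \cdot \lambda\overline{\xi} \urcorner$.

(App) If $t \cdot @$ is a traversal then so is $t \cdot @ \cdot n$, where $n$ points to $@$ with label 0.

Input-variable rules

(InputVar) If $t$ is a traversal where $t^\omega \in \mathbb{N}_{var}^{\circledast\vdash} \cup L_\lambda^{\circledast\vdash}$ and $x$ is an occurrence of a variable
node in $\llcorner t \lrcorner$ then so is $t \cdot n$ for every child $\lambda$-node $n$ of $x$, $n$ pointing to $x$.

(InputValue) If $t_1 \cdot x \cdot t_2$ is a traversal with pending node $x \in \mathbb{N}_{var}^{\circledast\vdash}$ then so is $t_1 \cdot x \cdot t_2 \cdot v_x$
for all $v \in \mathcal{D}$, where $v_x$ points to $x$ with label $v$.

Copy-cat rules

(Var) If $t \cdot n \cdot \lambda\overline{x} \ldots x_i$ is a traversal where $x_i \in \mathbb{N}_{var}^{@\vdash}$ then so is $t \cdot n \cdot \lambda\overline{x} \ldots x_i \cdot \lambda\overline{\eta_i}$
(with $x_i$ pointing to $\lambda\overline{x}$ and $\lambda\overline{\eta_i}$ pointing to $n$, both with label $i$).

(Value) If $t \cdot m \cdot n \ldots v_n$ is a traversal where $n \in \mathbb{N}$ then so is $t \cdot m \cdot n \ldots v_n \cdot v_m$
(with $v_n$ pointing to $n$ and $v_m$ pointing to $m$, both with label $v$).

Table 1: Traversal rules for the simply-typed $\lambda$-calculus.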



Prop. 6.3 shows that P-views are paths in the tree, thus $n$'s enabler occurs exactly once in the P-view.



A traversal always starts by visiting the root. Then it mainly
follows the structure of the tree. The (Var) rule permits us to
jump across the computation tree. The idea is that after visiting
a variable node $x$, a jump is allowed to the node corresponding
to the subterm that would be substituted for $x$ if all
the $\beta$-redexes occurring in the term were reduced. The sequence
$\lambda \cdot @ \cdot \lambda y \ldots y$ is an example of traversal of the computation tree shown
on the right.

[Figure: computation tree illustrating the jump performed by the (Var) rule.]



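Example 6.2. The following justified sequence is a traversal of the computation tree of
Example 4.2:

$$t = \lambda f z \cdot @ \cdot \lambda u v \cdot u \cdot \lambda y \cdot f \cdot \lambda \cdot y \cdot \lambda \cdot v \cdot \lambda \cdot z .$$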

Proposition 6.3. (Counterpart of the Path-traversal correspondence for higher-order
grammars [31, proposition 6].) Let $t$ be a traversal. Then:

(i) t is a well-defined and well-bracketed justified sequence; 

(ii) t is a well-defined justified sequence satisfying alternation, P-visibility and O-visibility; 

(iii) If $t$'s last node is not a value-leaf then $\ulcorner t \urcorner$ is the path in the computation tree going
from the root to $t$'s last node.

The reduction of a traversal $t$ is the subsequence of $t$ obtained by keeping only
occurrences of nodes that are hereditarily enabled by the root $\circledast$. This has the effect of
eliminating the "internal nodes" of the computation. If $t$ is a non-empty traversal then
the root $\circledast$ occurs exactly once in $t$, thus the reduction of $t$ is equal to $t \upharpoonright r$ where $r$ is the
first occurrence in $t$ (the only occurrence of the root). We write $\mathcal{T}rv(M)^{\upharpoonright\circledast}$ for the set of
reductions of traversals of $M$.
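
The reduction $t \upharpoonright r$ can be computed by following pointers back to the root occurrence. The Python sketch below is a minimal illustration under stated assumptions: occurrences are encoded as `(label, pointer)` pairs, and the sample sequence is the traversal of Example 6.11 for $\lambda f z.(\lambda g x.f\,x)(\lambda y.y)(f\,z)$ with pointer indices reconstructed by hand from the rules of Table 1.

```python
# Each occurrence is (label, pointer); the pointer is an index or None.
T = [
    ("\\fz", None),  # 0: root
    ("@",    None),  # 1: application node
    ("\\gx", 1),     # 2: child of @
    ("f",    0),     # 3: bound by the root
    ("\\",   3),     # 4: child of f
    ("x",    2),     # 5: bound by \gx
    ("\\",   1),     # 6: child of @
    ("f",    0),     # 7: bound by the root
    ("\\",   7),     # 8: child of f
    ("z",    0),     # 9: bound by the root
]


def hereditarily_justified(seq, i, r):
    """True if following pointers from occurrence i eventually reaches r."""
    while i is not None:
        if i == r:
            return True
        i = seq[i][1]
    return False


def restrict_to(seq, r):
    """Subsequence of occurrences hereditarily justified by r, re-indexed."""
    keep = [i for i in range(len(seq)) if hereditarily_justified(seq, i, r)]
    new_index = {old: new for new, old in enumerate(keep)}
    # A pointer leaving the kept set (only possible for r itself) becomes None.
    return [(seq[i][0], new_index.get(seq[i][1])) for i in keep]


reduction = restrict_to(T, 0)
print(" . ".join(label for label, _ in reduction))
```

The occurrences of `@`, `\gx`, `x` and the $\lambda$-node that is a child of `@` disappear, since their pointer chains end at the @-node rather than at the root.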

Example 6.4. The reduction of the traversal given in Example 6.2 is:

$$t \upharpoonright \lambda f z = \lambda f z \cdot f \cdot \lambda \cdot z .$$

Application nodes are used to connect the operator and the operand of an application
in the computation tree but since they do not play any role in the computation of the term,
we can remove them from the traversals. We write $t - @$ for the sequence of
nodes-with-pointers obtained by removing from $t$ all @-nodes and value-leaves of @-nodes, and where
every pointer to an @-node is replaced by a pointer to its immediate predecessor in $t$. We
write $\mathcal{T}rv(M)^{-@}$ for the set $\{ t - @ \mid t \in \mathcal{T}rv(M) \}$.
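
The operation $t - @$ can be sketched in the same style. The Python fragment below assumes the encoding `(label, kind, pointer)` and relies on the fact that, in a traversal, an @-node is always immediately preceded by a $\lambda$-node, so redirecting a pointer to the predecessor never lands on a removed occurrence. The sample data is again the traversal of Example 6.11 with hand-reconstructed pointers; all names are hypothetical.

```python
LAM, VAR, APP, LEAF = "lam", "var", "app", "leaf"

# Each occurrence is (label, kind, pointer); the pointer is an index or None.
T = [
    ("\\fz", LAM, None), ("@", APP, None), ("\\gx", LAM, 1), ("f", VAR, 0),
    ("\\", LAM, 3), ("x", VAR, 2), ("\\", LAM, 1), ("f", VAR, 0),
    ("\\", LAM, 7), ("z", VAR, 0),
]


def remove_app_nodes(seq):
    """Drop @-nodes and their value-leaves; redirect pointers to @ one step back."""
    def is_app(i):
        return seq[i][1] == APP

    keep = [i for i, (_, kind, ptr) in enumerate(seq)
            if kind != APP and not (kind == LEAF and is_app(ptr))]
    new_index = {old: new for new, old in enumerate(keep)}
    result = []
    for i in keep:
        label, kind, ptr = seq[i]
        if ptr is not None and is_app(ptr):
            ptr -= 1  # immediate predecessor of the @-node in seq
        result.append((label, kind, None if ptr is None else new_index[ptr]))
    return result


without_app = remove_app_nodes(T)
print(" . ".join(label for label, _, _ in without_app))
```

After the removal, the two $\lambda$-nodes that pointed to `@` point to the root occurrence, which is the predecessor of `@` in this traversal.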

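Example 6.5. Let $t$ be the traversal given in Example 6.2; we have:

$$t - @ = \lambda f z \cdot \lambda u v \cdot u \cdot \lambda y \cdot f \cdot \lambda \cdot y \cdot \lambda \cdot v \cdot \lambda \cdot z .$$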



Remark 6.6. Clearly if $M$ is $\beta$-normal then $\tau(M)$ does not contain any @-node, therefore all
nodes are hereditarily enabled by the root and we have $\mathcal{T}rv(M)^{-@} = \mathcal{T}rv(M) = \mathcal{T}rv(M)^{\upharpoonright\circledast}$.

Lemma 6.7. Suppose that $M$ is a $\beta$-normal simply-typed term. Let $t$ be a non-empty
traversal of $M$ and $r$ denote the only occurrence of $\tau(M)$'s root in $t$. If $t$'s last occurrence
is not a leaf then

$$\ulcorner ?(t) \urcorner \upharpoonright r = \ulcorner ?(t) \upharpoonright r \urcorner .$$

In the lambda calculus without interpreted constants this lemma follows immediately
from the fact that $\mathcal{T}rv(M) = \mathcal{T}rv(M)^{\upharpoonright\circledast}$. It remains valid in the presence of interpreted
constants provided that the traversal rules implementing the constants are well-behaved.



6.5. Computation trees and arenas. We consider the well-bracketed game model of the
simply-typed lambda calculus. We choose to represent strategies using "prefix-closed set of
plays"$^{10}$. We fix a term $\Gamma \vdash_{st} M : T$ and write $[\![\Gamma \vdash_{st} M : T]\!]$ for its strategy denotation.
The answer moves of a question $q$ are written $v_q$ where $v$ ranges in $\mathcal{D}$.

Proposition 6.8. There exists a function $\varphi_M$, constructible from $M$, that maps nodes from
$N \setminus (N_@ \cup N_\Sigma)$ to moves of the arenas underlying the strategy denotations of $M$'s subterms
such that:

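A traversal rule is well-behaved if it can be stated under the form "$t = t_1 \cdot n \cdot t_2 \in \mathcal{T}rv(M) \wedge ?(t) =
?(t_1) \cdot n \wedge n \in \mathbb{N}_\Sigma \cup \mathbb{N}_{var} \wedge P(t) \wedge m \in S(t) \implies t_1 \cdot n \cdot t_2 \cdot m \in \mathcal{T}rv(M)$" (with $m$ pointing to $n$) for some expression $P$ expressing
a condition on $t$ and function $S$ mapping traversals of the form of $t$ to a subset of the children of $n$.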

$^{10}$ In the literature, a strategy is commonly defined as a set of plays closed by taking a prefix of even
length. However for the purpose of showing the correspondence with traversals, the "prefix-closed"-based
definition is more adequate.

• $\varphi$ maps $\lambda$-nodes to O-questions, variable nodes to P-questions, value-leaves of $\lambda$-nodes to
P-answers and value-leaves of variable nodes to O-answers.

• $\varphi$ maps nodes of a given order to moves of the same order.

If $t = t_0 t_1 \ldots$ is a justified sequence of nodes in $N_\lambda \cup N_{var}$ then $\varphi(t)$ is defined to be the
sequence of moves $\varphi(t_0)\, \varphi(t_1) \ldots$ equipped with the pointers of $t$.

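Example 6.9. Take $\lambda x.(\lambda g.g x)(\lambda y.y)$ with $x, y : o$ and $g : (o,o)$. The diagram below
represents the computation tree (middle), the arenas $[\![(o,o), o]\!]$ (left), $[\![o,o]\!]$ (right), $[\![o \to o]\!]$
(rightmost) and $\varphi = \psi \cup \psi_{\lambda g.gx} \cup \psi_{\lambda y.y}$ (dashed lines).

[Figure: computation tree of $\lambda x.(\lambda g.g x)(\lambda y.y)$, the three arenas, and the function $\varphi$ drawn as dashed lines.]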



6.6. The Correspondence Theorem. In game semantics, strategy composition is
performed using a CSP-like "composition + hiding". If some of the internal moves are not
hidden then we obtain alternative denotations called revealed semantics [16] or
interaction semantics [13]. We obtain different notions of revealed semantics depending on the
choice of internal moves that we hide. For instance the fully revealed denotation of
$\Gamma \vdash_{st} M : T$, written $\langle\!\langle \Gamma \vdash_{st} M : T \rangle\!\rangle$, is obtained by uncovering all the internal moves from
$[\![\Gamma \vdash_{st} M : T]\!]$ that are generated during composition. The inverse operation consists in
filtering out the internal moves.

The syntactically-revealed denotation, written $\langle\!\langle \Gamma \vdash_{st} M : T \rangle\!\rangle_s$, differs from the
fully-revealed one in that only certain internal moves are preserved during composition:
When computing the denotation of an application joined by an @-node in the computation
tree, all the internal moves are preserved. When computing the denotation of $\langle\!\langle y_i N_1 \ldots N_p \rangle\!\rangle$
for some variable $y_i$, however, we only preserve the internal moves of $N_1, \ldots, N_p$ while
omitting the internal moves produced by the copy-cat projection strategy denoting $y_i$.

The Correspondence Theorem states that in the simply-typed lambda calculus, the set 
Trv(M) of traversals of the computation tree is isomorphic to the syntactically-revealed 
denotation, and the set of traversal reductions is isomorphic to the standard strategy de- 
notation: 

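Theorem 6.10 (The Correspondence Theorem). We have the following two isomorphisms:

(i) $\varphi_M : \mathcal{T}rv(M)^{-@} \cong \langle\!\langle \Gamma \vdash_{st} M : T \rangle\!\rangle_s$
(ii) $\varphi_M : \mathcal{T}rv(M)^{\upharpoonright\circledast} \cong [\![\Gamma \vdash_{st} M : T]\!]$.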



An algorithm that uniquely recovers hidden moves from $[\![\Gamma \vdash_{st} M : T]\!]$ was given by Hyland and Ong
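PH Part II].

Example 6.11. Take the term $M = \lambda f^{(o,o)} z^{o}.(\lambda g^{(o,o)} x^{o}.f x)(\lambda y^{o}.y)(f z)$ of type $((o,o), o, o)$.
The figure below represents the computation tree (left tree), the arena $[\![((o,o), o, o)]\!]$ (right
tree) and the function $\varphi_M$ (dashed line). (Answer moves are not shown for clarity.) Take
the traversal $t$ given hereunder; we have:

[Figure: computation tree of $M$ (left), arena $[\![((o,o), o, o)]\!]$ (right), and the function $\varphi_M$ drawn as dashed lines.]

$$t = \lambda f z \cdot @ \cdot \lambda g x \cdot f^{[1]} \cdot \lambda^{[2]} \cdot x \cdot \lambda^{[3]} \cdot f^{[4]} \cdot \lambda^{[5]} \cdot z$$

$$t \upharpoonright r = \lambda f z \cdot f^{[1]} \cdot \lambda^{[2]} \cdot f^{[4]} \cdot \lambda^{[5]} \cdot z$$

$$\varphi_M(t \upharpoonright r) = q^0 \cdot q^1 \cdot q^2 \cdot q^1 \cdot q^2 \cdot q^3 \in [\![M]\!]$$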



This work is licensed under the Creative Commons Attribution-NoDerivs License. To view
a copy of this license, visit http://creativecommons.org/licenses/by-nd/2.0/ or send a
letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.



